mariadb/storage/maria/ma_recovery.c

3049 lines
97 KiB
C

WL#3072 - Maria recovery
Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now.
Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria.
Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW.
Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE.
Code cleanups.
Monty: please look for "QQ". Sanja: please look for "Sanja".
Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase...
Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE.
sql/handler.cc:
  typo
storage/maria/Makefile.am:
  Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c
storage/maria/ha_maria.cc:
  comments
storage/maria/ma_bitmap.c:
  fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment near the _ma_initialize_data_file() call in ma_create.c for more. When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c:
  * no need to sync the data file if table is not transactional
  * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store).
  * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR).
  * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now).
storage/maria/ma_blockrec.h:
  new functions. maria_bitmap_marker does not need to be exported.
storage/maria/ma_close.c:
  we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()).
storage/maria/ma_control_file.c:
  when in Recovery, some assertions should not be used.
storage/maria/ma_control_file.h:
  double-inclusion safe
storage/maria/ma_create.c:
  during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c
storage/maria/ma_delete_table.c:
  during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero.
storage/maria/ma_loghandler.c:
  * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state").
  * LOG_DESC::record_ends_group changed to an enum.
  * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected
  * Sanja please see the @todo LOG BUG
  * avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
storage/maria/ma_loghandler.h:
  - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had
  - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log.
storage/maria/ma_recovery.c:
  New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled.
storage/maria/ma_recovery.h:
  new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c:
  don't write log records during recovery
storage/maria/ma_test2.c:
  - fail if maria_info() or other subtests find some wrong information
  - new option -g to skip updates.
  - init the translog before creating the table, so that log applying can work.
  - in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh:
  cleanup files. Test log applying.
storage/maria/maria_read_log.c:
  most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery:
  unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
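The three-way group classification described in the commit message above (a record may not end a group, may end a group, or may be a group in itself) can be sketched as a small state machine. This is only an illustration under stated assumptions: the names `group_role`, `group_state` and `see_record` are hypothetical and are not the identifiers used in ma_loghandler.h or ma_recovery.c, and the sketch counts records instead of executing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: the real enum lives in ma_loghandler.h. */
enum group_role
{
  REC_NOT_GROUP_END,   /* e.g. a REDO: joins the group being built */
  REC_ENDS_GROUP,      /* e.g. an UNDO: completes the group before it */
  REC_IS_GROUP_ITSELF  /* e.g. COMMIT: stands alone, aborts an open group */
};

struct group_state
{
  size_t pending;   /* records buffered in the unfinished group */
  size_t executed;  /* records whose group was completed */
  size_t aborted;   /* records discarded with an aborted group */
};

/* Feed one log record, classified by its role, to the state machine. */
static void see_record(struct group_state *st, enum group_role role)
{
  assert(st != NULL);
  switch (role)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                    /* wait until the group is closed */
    break;
  case REC_ENDS_GROUP:
    st->executed+= st->pending + 1;   /* buffered records plus this one */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;        /* unfinished group is dropped */
    st->pending= 0;
    st->executed++;                   /* the record itself is applied */
    break;
  }
}
```

Feeding REDO, REDO, UNDO executes all three records; feeding REDO then COMMIT discards the lone REDO, which matches the write-failure scenario described for ma_loghandler.h.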
/* Copyright (C) 2006, 2007 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
GPL license update (same change as was done for all files in 5.1). storage/maria/Makefile.am: GPL license update storage/maria/ft_maria.c: GPL license update storage/maria/ha_maria.cc: GPL license update storage/maria/ha_maria.h: GPL license update storage/maria/lockman.c: GPL license update storage/maria/lockman.h: GPL license update storage/maria/ma_bitmap.c: GPL license update storage/maria/ma_blockrec.c: GPL license update storage/maria/ma_blockrec.h: GPL license update storage/maria/ma_cache.c: GPL license update storage/maria/ma_changed.c: GPL license update storage/maria/ma_check.c: GPL license update storage/maria/ma_checkpoint.c: GPL license update storage/maria/ma_checkpoint.h: GPL license update storage/maria/ma_checksum.c: GPL license update storage/maria/ma_close.c: GPL license update storage/maria/ma_control_file.c: GPL license update storage/maria/ma_control_file.h: GPL license update storage/maria/ma_create.c: GPL license update storage/maria/ma_dbug.c: GPL license update storage/maria/ma_delete.c: GPL license update storage/maria/ma_delete_all.c: GPL license update storage/maria/ma_delete_table.c: GPL license update storage/maria/ma_dynrec.c: GPL license update storage/maria/ma_extra.c: GPL license update storage/maria/ma_ft_boolean_search.c: GPL license update storage/maria/ma_ft_eval.c: GPL license update storage/maria/ma_ft_eval.h: GPL license update storage/maria/ma_ft_nlq_search.c: GPL license update storage/maria/ma_ft_parser.c: GPL license update storage/maria/ma_ft_stem.c: GPL license update storage/maria/ma_ft_test1.c: GPL license update storage/maria/ma_ft_test1.h: GPL license update storage/maria/ma_ft_update.c: GPL license update storage/maria/ma_ftdefs.h: GPL license update storage/maria/ma_fulltext.h: GPL license update storage/maria/ma_info.c: GPL license update storage/maria/ma_init.c: GPL license update storage/maria/ma_key.c: GPL license update storage/maria/ma_keycache.c: GPL license update 
storage/maria/ma_least_recently_dirtied.c: GPL license update storage/maria/ma_least_recently_dirtied.h: GPL license update storage/maria/ma_locking.c: GPL license update storage/maria/ma_open.c: GPL license update storage/maria/ma_packrec.c: GPL license update storage/maria/ma_page.c: GPL license update storage/maria/ma_panic.c: GPL license update storage/maria/ma_preload.c: GPL license update storage/maria/ma_range.c: GPL license update storage/maria/ma_recovery.c: GPL license update storage/maria/ma_recovery.h: GPL license update storage/maria/ma_rename.c: GPL license update storage/maria/ma_rfirst.c: GPL license update storage/maria/ma_rkey.c: GPL license update storage/maria/ma_rlast.c: GPL license update storage/maria/ma_rnext.c: GPL license update storage/maria/ma_rnext_same.c: GPL license update storage/maria/ma_rprev.c: GPL license update storage/maria/ma_rrnd.c: GPL license update storage/maria/ma_rsame.c: GPL license update storage/maria/ma_rsamepos.c: GPL license update storage/maria/ma_rt_index.c: GPL license update storage/maria/ma_rt_index.h: GPL license update storage/maria/ma_rt_key.c: GPL license update storage/maria/ma_rt_key.h: GPL license update storage/maria/ma_rt_mbr.c: GPL license update storage/maria/ma_rt_mbr.h: GPL license update storage/maria/ma_rt_split.c: GPL license update storage/maria/ma_rt_test.c: GPL license update storage/maria/ma_scan.c: GPL license update storage/maria/ma_search.c: GPL license update storage/maria/ma_sort.c: GPL license update storage/maria/ma_sp_defs.h: GPL license update storage/maria/ma_sp_key.c: GPL license update storage/maria/ma_sp_test.c: GPL license update storage/maria/ma_static.c: GPL license update storage/maria/ma_statrec.c: GPL license update storage/maria/ma_test1.c: GPL license update storage/maria/ma_test2.c: GPL license update storage/maria/ma_test3.c: GPL license update storage/maria/ma_unique.c: GPL license update storage/maria/ma_update.c: GPL license update storage/maria/ma_write.c: GPL 
license update storage/maria/maria_chk.c: GPL license update storage/maria/maria_def.h: GPL license update storage/maria/maria_ftdump.c: GPL license update storage/maria/maria_pack.c: GPL license update storage/maria/tablockman.c: GPL license update storage/maria/tablockman.h: GPL license update storage/maria/trnman.c: GPL license update storage/maria/trnman.h: GPL license update
2007-03-02 11:20:23 +01:00
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
WL#3072 Maria recovery
First version written by Guilhem Bichot on 2006-04-27.
*/
/* Here is the implementation of this module */
#include "maria_def.h"
#include "ma_recovery.h"
#include "ma_blockrec.h"
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes
storage/maria/ha_maria.cc:
  The checkpoint done after Recovery is finished, is moved to maria_recover().
storage/maria/ma_bitmap.c:
  fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c:
  info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
  correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c:
  fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h:
  new parameter
storage/maria/maria_read_log.c:
  update to new prototype
2007-10-08 19:08:25 +02:00
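The ma_checkpoint.c fix above (reset pointers to NULL after freeing them, or the next checkpoint crashes in my_realloc()) follows a general C rule: reallocating through an already freed pointer is undefined behaviour, whereas reallocating through NULL simply allocates. A minimal sketch of the pattern, using the standard `realloc()`/`free()` rather than the mysys wrappers, with a hypothetical `dyn_buf` structure standing in for the checkpoint buffers:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer reused from one checkpoint to the next. */
struct dyn_buf
{
  char *data;
  size_t size;
};

/* Free the buffer and leave it in a state that is safe to reuse. */
static void buf_release(struct dyn_buf *b)
{
  assert(b != NULL);
  free(b->data);
  b->data= NULL;  /* without this, the next realloc() sees a dangling pointer */
  b->size= 0;
}

/* Grow or create the buffer; size must be non-zero. Returns 0 on success. */
static int buf_reserve(struct dyn_buf *b, size_t size)
{
  char *p;
  assert(b != NULL && size > 0);
  p= realloc(b->data, size);  /* realloc(NULL, n) behaves like malloc(n) */
  if (p == NULL)
    return 1;                 /* old buffer, if any, is left untouched */
  b->data= p;
  b->size= size;
  return 0;
}
```

With the reset in place, a release followed by a later reserve is well defined, which is the multiple-checkpoint sequence the commit refers to.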
#include "ma_checkpoint.h"
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable
include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address of the MARIA_PINNED_PAGE used
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
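The commit above says that, for transactional tables, the record number stored in keys is shifted up by one bit to make room for a flag telling whether a transid follows. A minimal sketch of that kind of encoding, assuming the flag occupies the lowest bit (the commit message does not say which bit is used) and using hypothetical helper names that are not the virtual functions added to maria_def.h:

```c
#include <assert.h>
#include <stdint.h>

/* Pack a record position with a one-bit "transid follows" flag.
   Assumption: the flag sits in bit 0 of the stored value. */
static uint64_t pack_rec_pos(uint64_t rec_pos, int with_transid)
{
  assert((rec_pos >> 63) == 0);  /* top bit must be free for the shift */
  return (rec_pos << 1) | (uint64_t) (with_transid ? 1 : 0);
}

/* Recover the record position from the stored value. */
static uint64_t unpack_rec_pos(uint64_t stored)
{
  return stored >> 1;
}

/* Tell whether a transid follows the stored value. */
static int has_transid(uint64_t stored)
{
  return (int) (stored & 1);
}
```

The cost of the scheme is one bit of addressable range per key entry, which is why the pack helper asserts that the top bit of the position is unused.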
#include "trnman.h"
#include "ma_key_recover.h"
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but misses its marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
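The ma_bitmap.c note in the commit message above turns on one observation: if the file is first extended to a full page and the end marker is written afterwards, a crash between the two steps leaves a page that looks complete by length alone. The sketch below models only that length check. It is not the code from ma_bitmap.c; the names `PAGE_SIZE_ASSUMED`, `MARKER_SIZE_ASSUMED`, `length_after_steps()` and `first_bitmap_page_looks_complete()` are hypothetical, and the 8192-byte page with a 2-byte marker is taken from the commit text.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the commit message; names are hypothetical. */
enum { PAGE_SIZE_ASSUMED = 8192, MARKER_SIZE_ASSUMED = 2 };

/*
  Physical file length after `steps_done` creation steps.
  Step 1 extends the file (chsize), step 2 writes the 2-byte end marker
  at the end of the page.
  old_scheme != 0: step 1 extends to the full page (the behaviour that was
  replaced). old_scheme == 0: step 1 extends to page size minus marker size.
*/
static uint64_t length_after_steps(int steps_done, int old_scheme)
{
  if (steps_done <= 0)
    return 0;
  if (steps_done == 1)
    return old_scheme ? PAGE_SIZE_ASSUMED
                      : PAGE_SIZE_ASSUMED - MARKER_SIZE_ASSUMED;
  return PAGE_SIZE_ASSUMED;                 /* marker written, page complete */
}

/*
  Length-only check a recovery pass could make (an assumption of this
  sketch): a page shorter than PAGE_SIZE_ASSUMED is treated as absent and
  recreated, a full-length page is trusted to carry its end marker.
*/
static int first_bitmap_page_looks_complete(uint64_t physical_length)
{
  return physical_length >= PAGE_SIZE_ASSUMED;
}
```

With the old ordering, a crash after step 1 yields a full-length page without a marker, which the length check wrongly accepts; with the 8190-byte extension the same crash yields a short page, which is simply recreated.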
struct st_trn_for_recovery /* used only in the REDO phase */
{
  LSN group_start_lsn, undo_lsn, first_undo_lsn;
  TrID long_trid;
};
struct st_dirty_page /* used only in the REDO phase */
{
  uint64 file_and_page_id;
  LSN rec_lsn;
};
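The commit notes describe the dirty-page list as a way of "testing of rec_lsn to know if page should be skipped during Recovery". A minimal sketch of that idea follows, under stated assumptions: the bit layout of the combined key, the linear search, and the rule of applying the REDO when a page is not listed are all illustrative choices, not taken from ma_recovery.c. The names `dirty_page_sketch`, `make_file_and_page_id()` and `redo_can_be_skipped()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_sketch_t;          /* stand-in for the LSN type */

struct dirty_page_sketch                /* mirrors st_dirty_page's two fields */
{
  uint64_t file_and_page_id;
  lsn_sketch_t rec_lsn;
};

/* Hypothetical layout: 24-bit file id in the high bits, 40-bit page number. */
static uint64_t make_file_and_page_id(uint32_t file_id, uint64_t page_no)
{
  const uint64_t page_mask = (UINT64_C(1) << 40) - 1;
  return ((uint64_t) (file_id & 0xFFFFFFu) << 40) | (page_no & page_mask);
}

/*
  Returns 1 if a REDO with LSN `record_lsn` can be skipped for the page
  identified by `key`, 0 if it has to be applied.
  Assumption of this sketch: a listed page already contains every change
  older than its rec_lsn; an unlisted page is handled conservatively by
  applying the REDO.
*/
static int redo_can_be_skipped(const struct dirty_page_sketch *pages,
                               size_t count, uint64_t key,
                               lsn_sketch_t record_lsn)
{
  size_t i;
  for (i = 0; i < count; i++)
  {
    if (pages[i].file_and_page_id == key)
      return record_lsn < pages[i].rec_lsn;
  }
  return 0;                             /* not listed: do not skip */
}
```

A hash keyed on the combined id would be the natural replacement for the linear scan once the list grows; the single 64-bit key is what makes that lookup cheap.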
struct st_table_for_recovery /* used in the REDO and UNDO phase */
{
  MARIA_HA *info;
  File org_kfile, org_dfile; /**< OS descriptors when Checkpoint saw table */
};
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
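The three-way distinction described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be illustrated with a small sketch. All names here (`group_role`, `group_state`, `feed_record`) are hypothetical and chosen for illustration; they are not the identifiers used in ma_loghandler.h or ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the role of a record type within a group. */
enum group_role
{
  REC_NOT_END_OF_GROUP,   /* e.g. a REDO: buffered until the group ends */
  REC_ENDS_GROUP,         /* e.g. an UNDO: closes the group, which is executed */
  REC_IS_GROUP_ITSELF     /* e.g. COMMIT: aborts any unfinished group */
};

/* Hypothetical bookkeeping for a single transaction's records. */
struct group_state
{
  size_t pending;   /* records buffered in the currently open group */
  size_t executed;  /* records executed so far */
  size_t aborted;   /* records discarded because their group never ended */
};

/* Feed one record to the state machine. */
static void feed_record(struct group_state *st, enum group_role role)
{
  assert(st != NULL);
  switch (role)
  {
  case REC_NOT_END_OF_GROUP:
    st->pending++;
    break;
  case REC_ENDS_GROUP:
    /* Execute the buffered records plus the group-ending record. */
    st->executed+= st->pending + 1;
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    /* Unfinished group is dropped; this record is executed alone. */
    st->aborted+= st->pending;
    st->pending= 0;
    st->executed++;
    break;
  }
}
```

In this sketch, two REDOs followed directly by a COMMIT leave `aborted == 2` and `executed == 1`, which matches the scenario in the message where a write failure prevents the UNDO from being logged.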
/* Variables used by all functions of this module. Ok as single-threaded */
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
static struct st_trn_for_recovery *all_active_trans;
static struct st_table_for_recovery *all_tables;
static HASH all_dirty_pages;
static struct st_dirty_page *dirty_pages_pool;
static LSN current_group_end_lsn,
checkpoint_start= LSN_IMPOSSIBLE;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
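The fix described in the commit message above (keep pages pinned and leave their LSN untouched while a group's REDOs are applied, then stamp them with the UNDO's LSN once the group is done) can be sketched as follows. This is a simplified model under stated assumptions: `struct page`, `apply_redo()` and `end_group()` are hypothetical names, and the real code works through the page cache rather than on an in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical, simplified view of a data page during the REDO phase. */
struct page
{
  lsn_t lsn;     /* LSN stamped on the page */
  int pinned;    /* non-zero while the current group still owns the page */
  int applied;   /* number of REDOs applied (for illustration only) */
};

/*
  Apply a REDO only if the page is older than the record. The page's LSN
  is deliberately left unchanged, so a later REDO of the same group about
  the same page is not wrongly skipped.
*/
static int apply_redo(struct page *pg, lsn_t redo_lsn)
{
  assert(pg != NULL);
  if (pg->lsn >= redo_lsn)
    return 0;              /* page already contains this change */
  pg->applied++;
  pg->pinned= 1;
  return 1;
}

/* At the end of the group, unpin pages and stamp them with the UNDO's LSN. */
static void end_group(struct page *pages, size_t n, lsn_t undo_lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;
      pages[i].pinned= 0;
    }
  }
}
```

Had `apply_redo()` stamped the UNDO's LSN immediately, the second REDO of the group (whose LSN is lower than the UNDO's) would have compared as already applied, which is the bug the message reports.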
#ifndef DBUG_OFF
/** Current group of REDOs is about this table and only this one */
static MARIA_HA *current_group_table;
#endif
static TrID max_long_trid= 0; /**< max long trid seen by REDO phase */
2007-08-29 16:43:01 +02:00
static FILE *tracef; /**< trace file for debugging */
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
static my_bool skip_DDLs; /**< if REDO phase should skip DDL records */
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
/** @brief to avoid writing a checkpoint if recovery did nothing. */
static my_bool checkpoint_useful;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
static my_bool procent_printed;
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
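The checksum handling described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Maria source: the names `demo_undo_record_length`, `demo_apply_checksum_delta` and `demo_add_row_checksum` are hypothetical, and the 4-byte checksum type is inferred only from the "X+4" remark. The last function shows the kind of multiplication that can stand in for an if().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the checksum delta occupies 4 bytes, as "X+4" suggests. */
typedef uint32_t demo_checksum;

/* Length of an UNDO-style record: X without live checksum, X+4 with it. */
static size_t demo_undo_record_length(size_t base_length,
                                      int has_live_checksum)
{
  return base_length + (has_live_checksum ? sizeof(demo_checksum) : 0);
}

/* REDO phase: fold the logged delta into the table's live checksum.
   Unsigned arithmetic wraps modulo 2^32, so a "negative" delta works. */
static demo_checksum demo_apply_checksum_delta(demo_checksum live,
                                               demo_checksum delta)
{
  return live + delta;
}

/* Add the row checksum only if it was not counted already (for example
   when the UNDO record was written). Multiplying by 0 or 1 replaces
   the if(). */
static demo_checksum demo_add_row_checksum(demo_checksum live,
                                           demo_checksum row_checksum,
                                           int already_counted)
{
  assert(already_counted == 0 || already_counted == 1);
  return live + row_checksum * (demo_checksum) !already_counted;
}
```

Storing the delta rather than the absolute checksum lets recovery update the live value without re-reading the row.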
static ulonglong now; /**< for tracking execution time of phases */
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is thus incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
uint warnings; /**< count of warnings */
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
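The bitmap-page fix described in the commit message above (size the file to 8190 bytes first, then write the 2-byte end marker) can be illustrated with a small classifier. This is a sketch under assumptions, not the code in ma_bitmap.c: `demo_classify_first_bitmap_page` and the `demo_page_state` values are hypothetical, and the marker bytes are passed in as a parameter because their real value is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of inspecting the first bitmap page at recovery. */
enum demo_page_state
{
  DEMO_PAGE_RECREATE,   /* file too short: page can be rebuilt from scratch */
  DEMO_PAGE_OK,         /* full page with a correct end marker */
  DEMO_PAGE_CORRUPTED   /* full page but the end marker is wrong */
};

static enum demo_page_state
demo_classify_first_bitmap_page(const unsigned char *file, size_t file_length,
                                size_t page_size,
                                const unsigned char *marker,
                                size_t marker_size)
{
  size_t i;
  assert(marker_size <= page_size);
  /* With the two-step creation, a crash before the marker write leaves
     page_size - marker_size bytes, so the page is detectably short. */
  if (file_length < page_size)
    return DEMO_PAGE_RECREATE;
  /* A full-length page must end with the marker. */
  for (i= 0; i < marker_size; i++)
    if (file[page_size - marker_size + i] != marker[i])
      return DEMO_PAGE_CORRUPTED;
  return DEMO_PAGE_OK;
}
```

With the older scheme (extend to the full page size, then overwrite the last 2 bytes), a crash between the two steps would land in the DEMO_PAGE_CORRUPTED case instead of the recoverable DEMO_PAGE_RECREATE case.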
#define prototype_redo_exec_hook(R) \
static int exec_REDO_LOGREC_ ## R(const TRANSLOG_HEADER_BUFFER *rec)
#define prototype_redo_exec_hook_dummy(R) \
static int exec_REDO_LOGREC_ ## R(const TRANSLOG_HEADER_BUFFER *rec \
__attribute ((unused)))
#define prototype_undo_exec_hook(R) \
static int exec_UNDO_LOGREC_ ## R(const TRANSLOG_HEADER_BUFFER *rec, TRN *trn)
prototype_redo_exec_hook(LONG_TRANSACTION_ID);
prototype_redo_exec_hook_dummy(CHECKPOINT);
prototype_redo_exec_hook(REDO_CREATE_TABLE);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
prototype_redo_exec_hook(REDO_RENAME_TABLE);
prototype_redo_exec_hook(REDO_REPAIR_TABLE);
prototype_redo_exec_hook(REDO_DROP_TABLE);
prototype_redo_exec_hook(FILE_ID);
prototype_redo_exec_hook(INCOMPLETE_LOG);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as LOGREC_INCOMPLETE_GROUP draws a boundary).
2007-12-10 23:26:53 +01:00
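The boundary role of LOGREC_INCOMPLETE_GROUP described in the commit message above can be sketched with a toy scan over one transaction's records. This is an illustrative model only: the `demo_rec_kind` enum and `demo_count_applicable_redos` are hypothetical, and the real REDO phase tracks groups per transaction with far more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds for a single transaction. */
enum demo_rec_kind
{
  DEMO_REDO,
  DEMO_UNDO,
  DEMO_CLR_END,
  DEMO_INCOMPLETE_GROUP
};

/* Count the REDOs that belong to a complete group, i.e. a run of REDOs
   closed by an UNDO or a CLR_END. An INCOMPLETE_GROUP record discards
   the pending run so that later records cannot close it. */
static size_t demo_count_applicable_redos(const enum demo_rec_kind *recs,
                                          size_t count)
{
  size_t applicable= 0, pending= 0, i;
  assert(recs != NULL || count == 0);
  for (i= 0; i < count; i++)
  {
    switch (recs[i])
    {
    case DEMO_REDO:
      pending++;
      break;
    case DEMO_UNDO:
    case DEMO_CLR_END:
      applicable+= pending;
      pending= 0;
      break;
    case DEMO_INCOMPLETE_GROUP:
      pending= 0;               /* draw the boundary */
      break;
    }
  }
  return applicable;            /* trailing pending REDOs are ignored */
}
```

For the sequence REDO, UNDO, REDO*, REDO, CLR_END the sketch counts three REDOs, which is the bug described above; inserting the boundary record after REDO* brings the count back to two.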
prototype_redo_exec_hook_dummy(INCOMPLETE_GROUP);
prototype_redo_exec_hook(REDO_INSERT_ROW_HEAD);
prototype_redo_exec_hook(REDO_INSERT_ROW_TAIL);
Merge some changes from sql directory in 5.1 tree Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce idenital .MAD files as normal usage REDO_FREE_ROW_BLOCKS doesn't anymore change pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs
2007-10-19 23:24:22 +02:00
prototype_redo_exec_hook(REDO_INSERT_ROW_BLOBS);
prototype_redo_exec_hook(REDO_PURGE_ROW_HEAD);
prototype_redo_exec_hook(REDO_PURGE_ROW_TAIL);
prototype_redo_exec_hook(REDO_FREE_HEAD_OR_TAIL);
prototype_redo_exec_hook(REDO_FREE_BLOCKS);
prototype_redo_exec_hook(REDO_DELETE_ALL);
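The prototype macros above rely on the preprocessor's `##` token pasting to derive one function name per record type. A self-contained sketch of the same pattern follows; `demo_record`, the `DEMO_*` types and `demo_dispatch` are hypothetical stand-ins, and the lookup table is only one possible way to call such hooks, not necessarily how this file registers them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for TRANSLOG_HEADER_BUFFER. */
typedef struct { int type; int payload; } demo_record;

/* Same shape as prototype_redo_exec_hook(): paste R into the name. */
#define prototype_demo_exec_hook(R) \
static int exec_DEMO_LOGREC_ ## R(const demo_record *rec)

enum demo_type { DEMO_CREATE, DEMO_DROP, DEMO_TYPE_COUNT };

/* With a trailing ';' the macro yields a forward declaration. */
prototype_demo_exec_hook(CREATE);
prototype_demo_exec_hook(DROP);

typedef int (*demo_hook)(const demo_record *rec);

/* One hook per record type, indexed by the type value. */
static const demo_hook demo_hooks[DEMO_TYPE_COUNT]=
{
  exec_DEMO_LOGREC_CREATE,
  exec_DEMO_LOGREC_DROP
};

/* Look up and run the hook; returns -1 for an unknown record type. */
static int demo_dispatch(const demo_record *rec)
{
  assert(rec != NULL);
  if (rec->type < 0 || rec->type >= DEMO_TYPE_COUNT)
    return -1;
  return demo_hooks[rec->type](rec);
}

/* Followed by a body, the same macro yields the definition. */
prototype_demo_exec_hook(CREATE) { return rec->payload + 1; }
prototype_demo_exec_hook(DROP)   { return rec->payload - 1; }
```

Using one macro for both declaration and definition keeps every hook signature identical, so a signature change only has to be made in one place.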
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
prototype_redo_exec_hook(REDO_INDEX);
prototype_redo_exec_hook(REDO_INDEX_NEW_PAGE);
prototype_redo_exec_hook(REDO_INDEX_FREE_PAGE);
prototype_redo_exec_hook(UNDO_ROW_INSERT);
prototype_redo_exec_hook(UNDO_ROW_DELETE);
prototype_redo_exec_hook(UNDO_ROW_UPDATE);
prototype_redo_exec_hook(UNDO_KEY_INSERT);
prototype_redo_exec_hook(UNDO_KEY_DELETE);
prototype_redo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
prototype_redo_exec_hook(COMMIT);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
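The commit message above describes two ideas that can be sketched in a few lines of C: the CLR_END write hook moves trn->undo_lsn back to the previous UNDO and adjusts the record count according to the type of record that was undone, and a dedicated sentinel (rather than 0) marks "this is not UNDO work". The sketch below is illustrative only. Every sketch_ name is hypothetical, the sentinel value is an arbitrary stand-in for LSN_ERROR, and the records++ for an undone DELETE is an assumption (the message only states the records-- for an undone INSERT).

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;                 /* hypothetical stand-in for LSN */

/* Hypothetical sentinel; the real LSN_ERROR value is not taken from the source. */
#define SKETCH_NOT_UNDO_WORK UINT64_MAX

enum sketch_undone_type
{
  SKETCH_UNDONE_INSERT,
  SKETCH_UNDONE_DELETE,
  SKETCH_UNDONE_UPDATE
};

struct sketch_trn
{
  sketch_lsn undo_lsn;   /* head of the transaction's UNDO chain */
  long records;          /* stand-in for state.records */
};

/*
  Rough equivalent of a write hook for CLR_END: runs while the record is
  written, so both updates happen together.
*/
static void sketch_clr_end_hook(struct sketch_trn *trn,
                                sketch_lsn previous_undo_lsn,
                                enum sketch_undone_type undone)
{
  trn->undo_lsn= previous_undo_lsn;     /* may legitimately be 0 */
  if (undone == SKETCH_UNDONE_INSERT)
    trn->records--;                     /* undoing an insert removes a row */
  else if (undone == SKETCH_UNDONE_DELETE)
    trn->records++;                     /* assumption: undoing a delete adds it back */
  /* an undone UPDATE leaves the row count unchanged */
}

/* 0 is a valid "previous UNDO" value, so only the sentinel means "not UNDO work". */
static int sketch_is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != SKETCH_NOT_UNDO_WORK;
}
```

With this shape, undoing a transaction's first UNDO passes 0 as the previous LSN and is still recognised as UNDO work, which is the case the old undo_lsn==0 test could not express.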
prototype_redo_exec_hook(CLR_END);
prototype_undo_exec_hook(UNDO_ROW_INSERT);
prototype_undo_exec_hook(UNDO_ROW_DELETE);
prototype_undo_exec_hook(UNDO_ROW_UPDATE);
prototype_undo_exec_hook(UNDO_KEY_INSERT);
prototype_undo_exec_hook(UNDO_KEY_DELETE);
prototype_undo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
static int run_redo_phase(LSN lsn, enum maria_apply_log_way apply);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
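The bitmap-page reasoning in the ma_bitmap.c note above can be restated as a small decision function: a page that is shorter than a full block is simply recreated, while a full-length page without its end marker looks corrupted. This is a minimal sketch of that reasoning, not the engine's code. The function, the enum and the constant are hypothetical names, and the 8192-byte block size is taken from the example in the message.

```c
#include <assert.h>

#define SKETCH_BLOCK_SIZE 8192UL   /* block size used in the commit message */

enum sketch_bitmap_state
{
  SKETCH_BITMAP_OK,        /* full page with its end marker */
  SKETCH_BITMAP_RECREATE,  /* page too short: recovery rebuilds it */
  SKETCH_BITMAP_CORRUPT    /* full page without marker: looks corrupted */
};

/*
  Classify the first bitmap page as a recovery pass might see it.
  file_length is the data file's length in bytes; has_end_marker says
  whether the last two bytes of the page carry the marker.
*/
static enum sketch_bitmap_state
sketch_check_first_bitmap(unsigned long file_length, int has_end_marker)
{
  if (file_length < SKETCH_BLOCK_SIZE)
    return SKETCH_BITMAP_RECREATE;
  return has_end_marker ? SKETCH_BITMAP_OK : SKETCH_BITMAP_CORRUPT;
}
```

Extending the file to only 8190 bytes before writing the marker means a crash between the two steps lands in the "recreate" case instead of the "corrupt" case.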
static uint end_of_redo_phase(my_bool prepare_for_undo_phase);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
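The boundary record described in the commit message above can be illustrated with a small, self-contained sketch. It is not the code from ma_recovery.c: `rec_kind`, `scan_groups()` and the `applied[]` array are hypothetical names, and the model assumes a single transaction whose REDOs stay pending until an UNDO or CLR_END closes their group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds; not the real LOGREC_* values. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Scan one transaction's records in log order and set applied[i] to 1 for
  every REDO that a REDO phase would execute under this simplified model.
  REDOs stay pending until an UNDO or CLR_END closes their group; an
  INCOMPLETE_GROUP record discards whatever is pending.
  Returns the number of REDOs marked as applied.
*/
static size_t scan_groups(const enum rec_kind *log, size_t n, int *applied)
{
  size_t group_start= 0, count= 0, i, j;

  assert(log != NULL && applied != NULL);
  for (i= 0; i < n; i++)
  {
    applied[i]= 0;
    switch (log[i])
    {
    case REC_REDO:
      break;                          /* pending until the group ends */
    case REC_UNDO:
    case REC_CLR_END:
      for (j= group_start; j < i; j++)
      {
        if (log[j] == REC_REDO)
        {
          applied[j]= 1;
          count++;
        }
      }
      group_start= i + 1;
      break;
    case REC_INCOMPLETE_GROUP:
      group_start= i + 1;             /* boundary: drop pending REDOs */
      break;
    }
  }
  return count;
}
```

With the sequence REDO, UNDO, REDO*, REDO, CLR_END the sketch marks REDO* (index 2) as applied, which is the bug the commit describes. With REDO, UNDO, REDO*, INCOMPLETE_GROUP, REDO, CLR_END it leaves REDO* unapplied and only applies the rollback's own REDO.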
static int run_undo_phase(uint uncommitted);
static void display_record_position(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec,
                                    uint number);
static int display_and_apply_record(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec);
static MARIA_HA *get_MARIA_HA_from_REDO_record(const
                                               TRANSLOG_HEADER_BUFFER *rec);
static MARIA_HA *get_MARIA_HA_from_UNDO_record(const
                                               TRANSLOG_HEADER_BUFFER *rec);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
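The two skip rules for id->name mappings in the commit message above (one in new_table(), one in get_MARIA_HA_from_REDO_record()) reduce to LSN comparisons. The sketch below only illustrates those comparisons: `lsn_t`, `struct table_mapping` and both function names are hypothetical, LSNs are modelled as plain integers, and the handling of equal LSNs is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; modelled here as a plain integer. */
typedef uint64_t lsn_t;

/* Hypothetical view of the two LSNs that matter for one table. */
struct table_mapping
{
  lsn_t lsn_of_file_id;     /* when the id->name mapping was logged */
  lsn_t create_rename_lsn;  /* REDOs before this LSN are ignored */
};

/*
  Rule described for new_table(): skip the table if its mapping is older
  than create_rename_lsn. Treating equal LSNs as usable is an assumption.
*/
static int table_mapping_is_usable(const struct table_mapping *m)
{
  assert(m != NULL);
  return m->lsn_of_file_id >= m->create_rename_lsn;
}

/*
  Rule described for get_MARIA_HA_from_REDO_record(): skip the record if
  the mapping is newer than the record, which can happen for records
  located before the checkpoint record.
*/
static int record_uses_mapping(const struct table_mapping *m, lsn_t rec_lsn)
{
  assert(m != NULL);
  return m->lsn_of_file_id <= rec_lsn;
}
```

For a mapping logged at LSN 100, a record at LSN 150 would use it and a record at LSN 80 would be skipped; if the table's create_rename_lsn were 200, the mapping itself would be ignored.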
static void prepare_table_for_close(MARIA_HA *info, TRANSLOG_ADDRESS horizon);
static LSN parse_checkpoint_record(LSN lsn);
static void new_transaction(uint16 sid, TrID long_id, LSN undo_lsn,
                            LSN first_undo_lsn);
static int new_table(uint16 sid, const char *name,
                     File org_kfile, File org_dfile,
                     LSN lsn_of_file_id);
static int new_page(File fileid, pgcache_page_no_t pageid, LSN rec_lsn,
                    struct st_dirty_page *dirty_page);
static int close_all_tables(void);
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
static my_bool close_one_table(const char *name, TRANSLOG_ADDRESS addr);
static void print_redo_phase_progress(TRANSLOG_ADDRESS addr);
2007-07-26 11:56:21 +02:00
/** @brief global [out] buffer for translog_read_record(); never shrinks */
static LEX_STRING log_record_buffer;
static void enlarge_buffer(const TRANSLOG_HEADER_BUFFER *rec)
{
  if (log_record_buffer.length < rec->record_length)
  {
    log_record_buffer.length= rec->record_length;
    log_record_buffer.str= my_realloc(log_record_buffer.str,
                                      rec->record_length,
                                      MYF(MY_WME | MY_ALLOW_ZERO_PTR));
  }
}
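The grow-only pattern used by enlarge_buffer() can be sketched in portable C as below. This is a minimal illustration, not the Maria code: `grow_buffer` and `grow_buffer_reserve()` are hypothetical names, standard `realloc()` stands in for `my_realloc()`, and the sketch reports allocation failure to the caller and updates the recorded length only after the allocation has succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for LEX_STRING: a pointer plus its allocated length. */
typedef struct
{
  char *str;
  size_t length;
} grow_buffer;

/*
  Make sure buf can hold at least `needed` bytes; the buffer never shrinks.
  Returns 0 on success, 1 if the allocation failed (buf is left unchanged).
*/
static int grow_buffer_reserve(grow_buffer *buf, size_t needed)
{
  char *new_str;
  assert(buf != NULL);
  if (buf->length >= needed)
    return 0;                           /* already large enough */
  /* needed > buf->length here, so needed is at least 1 */
  new_str= realloc(buf->str, needed);   /* realloc(NULL, n) acts like malloc(n) */
  if (new_str == NULL)
    return 1;                           /* the old block is still valid */
  buf->str= new_str;
  buf->length= needed;
  return 0;
}
```

A caller would invoke `grow_buffer_reserve(&buf, record_length)` before each read and treat a non-zero return as an out-of-memory error.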
/** @brief Tells what kind of progress message was printed to the error log */
static enum recovery_message_type
{
  REC_MSG_NONE= 0, REC_MSG_REDO, REC_MSG_UNDO, REC_MSG_FLUSH
} recovery_message_printed;
/** @brief Prints to a trace file if it is not NULL */
void tprint(FILE *trace_file, const char *format, ...)
  ATTRIBUTE_FORMAT(printf, 2, 3);
void tprint(FILE *trace_file __attribute__ ((unused)),
            const char *format __attribute__ ((unused)), ...)
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
{
  va_list args;
  va_start(args, format);
  if (trace_file != NULL)
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
  {
    if (procent_printed)
    {
      procent_printed= 0;
      fputc('\n', trace_file ? trace_file : stderr);
    }
    vfprintf(trace_file, format, args);
  }
  va_end(args);
}
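The shape of tprint() (an optional trace stream, a pending progress line that must be ended before new output, and a printf-style variable argument list) can be sketched in standard C as follows. The names `trace_printf()` and `progress_pending` are hypothetical, and returning the character count is an assumption made here for testability; the function above returns void.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical flag: non-zero while an unfinished progress line is on screen. */
static int progress_pending= 0;

/*
  Print a formatted message to trace_file if it is not NULL.
  Returns the number of characters written by the format (0 if no trace file).
*/
static int trace_printf(FILE *trace_file, const char *format, ...)
{
  va_list args;
  int written;
  assert(format != NULL);
  if (trace_file == NULL)
    return 0;                        /* tracing disabled: do nothing */
  if (progress_pending)
  {
    progress_pending= 0;
    fputc('\n', trace_file);         /* end the progress line first */
  }
  va_start(args, format);
  written= vfprintf(trace_file, format, args);
  va_end(args);
  return written;
}
```

Starting the `va_list` only after the NULL check keeps every `va_start()` paired with exactly one `va_end()` on the same path.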
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
void eprint(FILE *trace_file, const char *format, ...)
  ATTRIBUTE_FORMAT(printf, 2, 3);
void eprint(FILE *trace_file __attribute__ ((unused)),
            const char *format __attribute__ ((unused)), ...)
{
  va_list args;
  va_start(args, format);
  if (procent_printed)
  {
    /* In silent mode, print on another line than the 0% 10% 20% line */
    procent_printed= 0;
    fputc('\n', trace_file ? trace_file : stderr);
  }
  vfprintf(trace_file ? trace_file : stderr, format, args);
  va_end(args);
}
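The newline handling in eprint() can be tried in isolation. The following is a minimal standalone sketch, not the code in ma_recovery.c: `progress_line_open`, `print_progress()` and `report_error()` are hypothetical names introduced here for illustration, with `progress_line_open` standing in for the file-level `procent_printed` flag. It assumes progress tokens and error messages share one stream.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the procent_printed flag used by eprint(). */
static int progress_line_open= 0;

/* Print one progress token (e.g. "10%") on the current line, no newline. */
static void print_progress(FILE *out, const char *token)
{
  fprintf(out, "%s ", token);
  progress_line_open= 1;
}

/*
  Print an error message. If a progress line is still open, terminate it
  first so that the message starts on a line of its own.
  Returns 1 if a newline had to be inserted, 0 otherwise.
*/
static int report_error(FILE *out, const char *format, ...)
{
  va_list args;
  int newline_added= 0;
  assert(format != NULL);
  if (!out)
    out= stderr;                      /* same fallback as eprint() */
  if (progress_line_open)
  {
    progress_line_open= 0;
    fputc('\n', out);
    newline_added= 1;
  }
  va_start(args, format);
  vfprintf(out, format, args);
  va_end(args);
  return newline_added;
}
```

Calling `print_progress(stdout, "0%")` followed by `report_error(stdout, "Error: %d\n", 5)` puts the error on its own line; a second `report_error()` call right after it inserts no extra newline because the flag has been reset.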
#define ALERT_USER() DBUG_ASSERT(0)
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
static void print_preamble()
{
  ma_message_no_user(ME_JUST_INFO, "starting recovery");
}
/**
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
   @brief Recovers from the last checkpoint.

   Runs the REDO phase using special structures, then sets up the playground
   of runtime: recreates transactions inside trnman, opens tables with their
   two-byte-id mapping; takes a checkpoint and runs the UNDO phase. Closes all
   tables.
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state). storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
  @return Operation status
    @retval 0      OK
    @retval !=0    Error
*/
int maria_recover(void)
{
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
  int res= 1;
  FILE *trace_file;
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
  uint warnings_count;
  DBUG_ENTER("maria_recover");
  DBUG_ASSERT(!maria_in_recovery);
  maria_in_recovery= TRUE;
#ifdef EXTRA_DEBUG
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols.
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
  trace_file= fopen("maria_recovery.trace", "a+");
#else
  trace_file= NULL; /* no trace file, for speed */
#endif
  tprint(trace_file, "TRACE of the last MARIA recovery from mysqld\n");
  DBUG_ASSERT(maria_pagecache->inited);
  res= maria_apply_log(LSN_IMPOSSIBLE, MARIA_LOG_APPLY, trace_file,
                       TRUE, TRUE, TRUE, &warnings_count);
  if (!res)
  {
    if (warnings_count == 0)
      tprint(trace_file, "SUCCESS\n");
    else
    {
      tprint(trace_file, "DOUBTFUL (%u warnings, check previous output)\n",
             warnings_count);
      /*
        We asked for execution of UNDOs, and skipped DDLs, so shouldn't get
        any warnings.
      */
      DBUG_ASSERT(0);
    }
  }
  if (trace_file)
    fclose(trace_file);
  maria_in_recovery= FALSE;
  DBUG_RETURN(res);
}
/**
   @brief Displays and/or applies the log
   @param  from_lsn        LSN from which log reading/applying should start;
                           LSN_IMPOSSIBLE means "use last checkpoint"
   @param  apply           whether and how log records should be applied
   @param  trace_file       trace file where progress/debug messages will go
   @param  should_run_undo_phase Should the UNDO phase be run or not
   @param  skip_DDLs_arg    Should DDL records (CREATE/RENAME/DROP/REPAIR)
                            be skipped by the REDO phase or not
   @param  take_checkpoints Whether to take checkpoints (at the end of the
                            REDO phase and at the very end of recovery)
   @param[out] warnings_count The count of warnings is stored here
   @todo This trace_file thing is primitive; soon we will make it similar to
   ma_check_print_warning() etc, and a successful recovery does not need to
   create a trace file. But for debugging now it is useful.
   @return Operation status
     @retval 0      OK
     @retval !=0    Error
*/

int maria_apply_log(LSN from_lsn, enum maria_apply_log_way apply,
                    FILE *trace_file,
                    my_bool should_run_undo_phase, my_bool skip_DDLs_arg,
                    my_bool take_checkpoints, uint *warnings_count)
{
2007-07-26 11:56:21 +02:00
int error= 0;
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
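The boundary role of LOGREC_INCOMPLETE_GROUP described in the commit message above can be illustrated with a minimal sketch. It is not the code of ma_recovery.c: the `rec_kind` enum and `count_executed_redos()` are hypothetical names, and the model assumes a single transaction whose pending REDOs are executed only once a group-ending record (UNDO or CLR_END) is met.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds; not the real LOGREC_* enum. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Returns how many REDO records a REDO-phase scan would execute for one
  transaction. REDOs stay pending until a record that ends the group
  (UNDO or CLR_END) is met; an INCOMPLETE_GROUP marker drops them.
*/
size_t count_executed_redos(const enum rec_kind *log, size_t n)
{
  size_t pending= 0, executed= 0, i;
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;                 /* buffered, not executed yet */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      executed+= pending;        /* group is complete: execute it */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;                /* boundary: forget the unfinished group */
      break;
    }
  }
  return executed;
}
```

In this model, the log REDO - UNDO - REDO* executes one REDO. Appending the rollback's REDO - CLR_END without a marker makes a later scan pair REDO* with the CLR_END and execute it; placing the marker right after REDO* keeps it skipped.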
uint uncommitted_trans;
ulonglong old_now;
DBUG_ENTER("maria_apply_log");
DBUG_ASSERT(apply == MARIA_LOG_APPLY || !should_run_undo_phase);
DBUG_ASSERT(!maria_multi_threaded);
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
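As a rough illustration of the warning count described above, the sketch below scans a simplified log and counts the records that mark it as incomplete, then lets the count temper the final message. All names (`rec_kind`, `count_log_warnings()`, `final_message()`) and both message strings are assumptions made for this example; the real maria_apply_log() and maria_read_log differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record kinds; not the real LOGREC_* enum. */
enum rec_kind { REC_REDO_INSERT, REC_INCOMPLETE_LOG, REC_COMMIT };

/*
  Counts one warning per record saying that logging was disabled
  (ALTER TABLE or CREATE SELECT in the commit message above), because
  the REDO_INSERT* records of that statement are missing from the log.
*/
unsigned count_log_warnings(const enum rec_kind *log, size_t n)
{
  unsigned warnings= 0;
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (log[i] == REC_INCOMPLETE_LOG)
      warnings++;
  }
  return warnings;
}

/* Illustrative wording only: a non-zero count tempers the final message. */
const char *final_message(unsigned warnings)
{
  return warnings ? "SUCCESS (with warnings)" : "SUCCESS";
}
```

Returning a count rather than printing at startup means the caller only warns when the log actually shows the problem.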
warnings= 0;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
/* checkpoints can happen only if TRNs have been built */
DBUG_ASSERT(should_run_undo_phase || !take_checkpoints);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
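The ma_bitmap.c reasoning in the commit message above (extend the file to 8190 bytes first, then write the 2-byte end marker) can be sketched as a small classification of what a later recovery would find on disk. This is a simplified model and not the actual code: `classify_first_bitmap_page()`, the `page_state` enum and the two macros are hypothetical names, and the 8192-byte page size is taken from the commit message.

```c
#include <assert.h>

#define BITMAP_PAGE_SIZE 8192UL  /* "8k bitmap page" from the commit message */
#define MARKER_SIZE      2UL     /* end marker occupies the last two bytes */

/* Hypothetical outcome of inspecting the first bitmap page at recovery. */
enum page_state { PAGE_OK, PAGE_RECREATE, PAGE_CORRUPTED };

/*
  A page shorter than the full size is treated as "not there yet" and is
  recreated; a full-size page without its end marker looks corrupted.
*/
enum page_state classify_first_bitmap_page(unsigned long file_length,
                                           int marker_present)
{
  if (file_length < BITMAP_PAGE_SIZE)
    return PAGE_RECREATE;
  return marker_present ? PAGE_OK : PAGE_CORRUPTED;
}
```

With the old scheme, a crash between the two operations leaves a full 8192-byte page with no marker, which this model reports as corrupted. With the new scheme the same crash leaves only 8190 bytes (`BITMAP_PAGE_SIZE - MARKER_SIZE`), which is reported as a page to recreate.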
all_active_trans= (struct st_trn_for_recovery *)
my_malloc((SHORT_TRID_MAX + 1) * sizeof(struct st_trn_for_recovery),
MYF(MY_ZEROFILL));
all_tables= (struct st_table_for_recovery *)
my_malloc((SHARE_ID_MAX + 1) * sizeof(struct st_table_for_recovery),
2007-07-26 11:56:21 +02:00
MYF(MY_ZEROFILL));
if (!all_active_trans || !all_tables)
goto err;
WL#3071 - Maria checkpoint - serializing calls to flush_pagecache_blocks_int() on the same file to avoid known concurrency bugs - having that, we can now enable the background thread, as the flushes it does are now supposedly safe in concurrent situations. - new type of flush FLUSH_KEEP_LAZY: when the background checkpoint thread is flushing a packet of dirty pages between two checkpoints, it uses this flush type, indeed if a file is already being flushed by another thread it's smarter to move on to the next file than wait. - maria_checkpoint_frequency renamed to maria_checkpoint_interval. include/my_sys.h: new type of flushing for the page cache: FLUSH_KEEP_LAZY mysql-test/r/maria.result: result update mysys/mf_keycache.c: indentation. No FLUSH_KEEP_LAZY support in key cache. storage/maria/ha_maria.cc: maria_checkpoint_frequency was somehow a hidden part of the Checkpoint API and that was not good. Now we have checkpoint_interval, local to ha_maria.cc, which serves as container for the user-visible maria_checkpoint_interval global variable; setting it calls update_checkpoint_interval which passes the new value to ma_checkpoint_init(). There is no hiding anymore. By default, enable background thread which does checkpoints every 30 seconds, and dirty page flush in between. That thread takes a checkpoint when it ends, so no need for maria_hton_panic to take one. The | is | and not ||, because maria_panic() must always be called. frequency->interval. storage/maria/ma_checkpoint.c: Use FLUSH_KEEP_LAZY for background thread when it flushes packets of dirty pages between two checkpoints: it is smarter to move on to the next file than wait for it to have been completely flushed, which may take long. Comments about flush concurrency bugs moved from ma_pagecache.c. Removing out-of-date comment. frequency->interval. create_background_thread -> (interval>0). In ma_checkpoint_background(), some variables need to be preserved between iterations. 
storage/maria/ma_checkpoint.h: new prototype storage/maria/ma_pagecache.c: - concurrent calls of flush_pagecache_blocks_int() on the same file cause bugs (see @note in that function); we fix them by serializing in this situation. For that we use a global hash of (file, wqueue). When flush_pagecache_blocks_int() starts it looks into the hash, using the file as key. If not found, it inserts (file,wqueue) into the hash, flushes the file, and finally removes itself from the hash and wakes up any waiter in the queue. If found, it adds itself to the wqueue and waits. - As a by-product, we can remove changed_blocks_is_incomplete and replace it by scanning the hash, replace the sleep() by a queue wait. - new type of flush FLUSH_KEEP_LAZY: when flushing a file, if it's already being flushed by another thread (even partially), return immediately. storage/maria/ma_pagecache.h: In pagecache, a hash of files currently being flushed (i.e. there is a call to flush_pagecache_blocks_int() for them). storage/maria/ma_recovery.c: new prototype storage/maria/ma_test1.c: new prototype storage/maria/ma_test2.c: new prototype
2007-10-19 14:15:13 +02:00
if (take_checkpoints && ma_checkpoint_init(0))
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
goto err;
recovery_message_printed= REC_MSG_NONE;
tracef= trace_file;
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and so is incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
skip_DDLs= skip_DDLs_arg;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks its marker, conclude it is corrupted, and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
if (from_lsn == LSN_IMPOSSIBLE)
{
if (last_checkpoint_lsn == LSN_IMPOSSIBLE)
{
from_lsn= translog_first_theoretical_lsn();
/*
No checkpoint has been taken yet, so the very first log file
should still be present.
*/
if (unlikely((from_lsn == LSN_IMPOSSIBLE) ||
(from_lsn == LSN_ERROR)))
goto err;
}
else
{
from_lsn= parse_checkpoint_record(last_checkpoint_lsn);
if (from_lsn == LSN_ERROR)
goto err;
}
}
now= my_getsystime();
if (run_redo_phase(from_lsn, apply))
goto err;
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is a LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
if ((uncommitted_trans=
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
end_of_redo_phase(should_run_undo_phase)) == (uint)-1)
    goto err;
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes
storage/maria/ha_maria.cc:
  The checkpoint done after Recovery is finished is moved to maria_recover().
storage/maria/ma_bitmap.c:
  fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c:
  info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
  correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c:
  fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h:
  new parameter
storage/maria/maria_read_log.c:
  update to new prototype
2007-10-08 19:08:25 +02:00
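The ma_checkpoint.c note in the commit message above (reset pointers to NULL after freeing them, or the next checkpoint segfaults in my_realloc()) is an instance of a general C rule: a buffer that is later grown with realloc must be either live or NULL, never dangling. A minimal sketch of that rule, assuming the standard free()/realloc() in place of the mysys wrappers; struct page_list and its two functions are hypothetical names, not part of Maria:

```c
#include <stdlib.h>

/* Hypothetical reusable buffer that survives from one checkpoint to the next. */
struct page_list
{
  char   *buf;   /* NULL when no buffer is held */
  size_t  size;  /* allocated bytes, 0 when buf is NULL */
};

/*
  Resize the buffer to n bytes (n must be > 0).
  realloc(NULL, n) behaves like malloc(n), so a released list can be reused.
  Returns 0 on success, -1 on failure (the old buffer, if any, stays valid).
*/
static int page_list_reserve(struct page_list *list, size_t n)
{
  char *p= realloc(list->buf, n);
  if (p == NULL)
    return -1;
  list->buf= p;
  list->size= n;
  return 0;
}

/* Free the buffer and reset the pointer so that the next reserve is safe. */
static void page_list_release(struct page_list *list)
{
  free(list->buf);
  list->buf= NULL;   /* without this, the next realloc() receives a dangling pointer */
  list->size= 0;
}
```

Calling page_list_reserve(), page_list_release() and page_list_reserve() again mirrors two consecutive checkpoints; the second reserve is only well defined because release cleared the pointer.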
  old_now= now;
  now= my_getsystime();
  if (recovery_message_printed == REC_MSG_REDO)
  {
    float phase_took= (now - old_now)/10000000.0;
WL#3071 Maria checkpoint, WL#3072 Maria recovery
instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line.
sql_print_error() changed to use my_progname_short (nicer display).
mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r").
include/my_sys.h:
  new flags to tell error_handler_hook that this is not an error but an information or warning
mysql-test/mysql-test-run.pl:
  when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r")
mysys/my_init.c:
  set my_progname_short
mysys/my_static.c:
  my_progname_short added
sql/mysqld.cc:
  * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria.
  * plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs())
  * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen)
storage/maria/ma_checkpoint.c:
  fprintf(stderr) -> ma_message_no_user()
storage/maria/ma_checkpoint.h:
  function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr).
storage/maria/ma_recovery.c:
  To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line.
  071116 18:42:16 [Note] mysqld: Maria engine: starting recovery
  recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds);
  071116 18:42:16 [Note] mysqld: Maria engine: recovery done
storage/maria/maria_chk.c:
  my_progname_short moved to mysys
storage/maria/maria_read_log.c:
  my_progname_short moved to mysys
storage/myisam/myisamchk.c:
  my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
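The commit message above describes routing a message to [Note], [Warning] or [ERROR] according to the ME_JUST_INFO and ME_JUST_WARNING flags. A rough sketch of that dispatch follows. The flag values, the precedence when both flags are set, and every function name here are assumptions made for illustration; the real constants live in include/my_sys.h and the real dispatch is in my_message_sql().

```c
#include <stdio.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define SKETCH_ME_JUST_INFO     1u
#define SKETCH_ME_JUST_WARNING  2u

enum log_level { LEVEL_NOTE, LEVEL_WARNING, LEVEL_ERROR };

/* No flag set means a plain error, which is the historical behaviour. */
static enum log_level level_from_flags(unsigned flags)
{
  if (flags & SKETCH_ME_JUST_INFO)
    return LEVEL_NOTE;
  if (flags & SKETCH_ME_JUST_WARNING)
    return LEVEL_WARNING;
  return LEVEL_ERROR;
}

/* Tag printed in the error log for each level. */
static const char *level_tag(enum log_level level)
{
  switch (level)
  {
  case LEVEL_NOTE:    return "[Note]";
  case LEVEL_WARNING: return "[Warning]";
  default:            return "[ERROR]";
  }
}

/* Format one log line into buf; returns what snprintf() returns. */
static int format_message(char *buf, size_t size, unsigned flags,
                          const char *text)
{
  return snprintf(buf, size, "%s %s",
                  level_tag(level_from_flags(flags)), text);
}
```

With this shape, the start and end of recovery can be reported with the info flag while a failed recovery passes no flag and is logged as an error.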
    /*
      Detailed progress info goes to stderr, because ma_message_no_user()
      cannot put several messages on one line.
    */
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (Causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updated some result files to new table checksums results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with less than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
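The commit message above introduces HA_ERR_FILE_TOO_SHORT so that a read which reaches end of file before the requested byte count can be told apart from a genuine I/O failure, instead of both being reported as -1. A small sketch of that distinction, assuming plain stdio rather than my_read()/my_pread(); the enum values and read_exact() are hypothetical names for this example:

```c
#include <stdio.h>

/* Hypothetical status codes; only the three-way split matters here. */
enum read_status
{
  READ_OK= 0,
  READ_IO_ERROR= -1,        /* the device or stream reported an error */
  READ_FILE_TOO_SHORT= -2   /* end of file came before `count` bytes */
};

/* Read exactly `count` bytes, or say precisely why that was impossible. */
static enum read_status read_exact(FILE *file, void *buf, size_t count)
{
  size_t got= fread(buf, 1, count, file);
  if (got == count)
    return READ_OK;
  /* A short read is either a stream error or a file that is simply too short. */
  return ferror(file) ? READ_IO_ERROR : READ_FILE_TOO_SHORT;
}
```

A caller such as a page reader can then treat READ_FILE_TOO_SHORT as "this page was never fully written" and READ_IO_ERROR as a failure worth reporting to the user.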
    procent_printed= 1;
    fprintf(stderr, " (%.1f seconds); ", phase_took);
  }
  /**
     REDO phase does not fill blocks' rec_lsn, so a checkpoint now would be
     wrong: if a future recovery used it, the REDO phase would always
     start from the checkpoint and never from before, wrongly skipping REDOs
     (tested).
WL#3072 - Maria recovery
maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed.
storage/maria/ha_maria.cc:
  as we now log a record when disabling transactionality, we need the TRN to be set up first
storage/maria/ma_blockrec.c:
  comment
storage/maria/ma_loghandler.c:
  new type of log record
storage/maria/ma_loghandler.h:
  new type of log record
storage/maria/ma_recovery.c:
  * maria_apply_log() now returns a count of warnings. What currently produces warnings is:
    - skipping applying UNDOs though there are some (=> inconsistent table)
    - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete).
    Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen).
  * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h:
  maria_apply_log() returns a count of warnings
storage/maria/maria_def.h:
  _ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c:
  maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
     @todo fix this; pagecache_write() now can have a rec_lsn argument.
  */
#if 0
WL#3071 - Maria checkpoint
* Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing).
* Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown.
* fix for a test failure (after-merge fix)
include/maria.h:
  new variable
mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result:
  result update
mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test:
  position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail)
storage/maria/ha_maria.cc:
  Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init().
storage/maria/ma_checkpoint.c:
  Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset.
storage/maria/ma_recovery.c:
  Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
  if (take_checkpoints && checkpoint_useful)
  {
    /*
      We take a checkpoint as it can save future recovery work if we crash
      during the UNDO phase. But we don't flush pages, as UNDOs will change
      them again probably.
    */
    if (ma_checkpoint_execute(CHECKPOINT_INDIRECT, FALSE))
      goto err;
  }
#endif
if (should_run_undo_phase)
{
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
if (run_undo_phase(uncommitted_trans))
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
goto err;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
}
else if (uncommitted_trans > 0)
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and so is incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
{
tprint(tracef, "***WARNING: %u uncommitted transactions; some tables may"
" be left inconsistent!***\n", uncommitted_trans);
warnings++;
}
old_now= now;
now= my_getsystime();
if (recovery_message_printed == REC_MSG_UNDO)
{
float phase_took= (now - old_now)/10000000.0;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
procent_printed= 1;
fprintf(stderr, " (%.1f seconds); ", phase_took);
}
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
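The commit note above describes three roles a log record type can have with respect to groups: it may not end a group, it may end a group, or it may be a group in itself (like COMMIT), in which case it aborts any unfinished group that precedes it. A minimal sketch of that rule follows. The names `group_role`, `group_state` and `feed_record` are hypothetical and chosen for illustration only; they are not the identifiers used in ma_loghandler.h or ma_recovery.c, and the counters stand in for the real work of executing or discarding records.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only (not the engine's own enum). */
enum group_role {
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO: belongs to a group still being built */
  REC_LAST_IN_GROUP,      /* e.g. an UNDO: closes the group, group is executed */
  REC_IS_GROUP_ITSELF     /* e.g. a COMMIT: complete on its own */
};

struct group_state {
  int pending;   /* records collected in the current unfinished group */
  int executed;  /* records that were applied */
  int aborted;   /* records dropped because their group never ended */
};

/* Feed one log record, identified only by its role, to the group tracker. */
static void feed_record(struct group_state *st, enum group_role role)
{
  assert(st != NULL);
  switch (role) {
  case REC_NOT_LAST_IN_GROUP:
    st->pending++;                    /* wait for the end of the group */
    break;
  case REC_LAST_IN_GROUP:
    st->executed+= st->pending + 1;   /* whole group, including this record */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;        /* unfinished group is discarded */
    st->pending= 0;
    st->executed++;                   /* the record itself is still applied */
    break;
  }
}
```

Feeding two REDO-like records followed by a COMMIT-like record leaves `aborted == 2` and `executed == 1`, which matches the scenario in the note where a write failure prevented the UNDO from being logged before the COMMIT.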
/*
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
we don't use maria_panic() because it would maria_end(), and Recovery does
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here.
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
not want that (we want to keep some modules initialized for runtime).
*/
if (close_all_tables())
2007-07-26 11:56:21 +02:00
  goto err;
old_now= now;
now= my_getsystime();
if (recovery_message_printed == REC_MSG_FLUSH)
{
  float phase_took= (now - old_now)/10000000.0;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (Causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updates some result files to new table checksums results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with less than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
  procent_printed= 1;
  fprintf(stderr, " (%.1f seconds); ", phase_took);
}
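The division by 10000000.0 in the block above implies that the timestamps are counted in 100-nanosecond units, so ten million ticks make one second. A minimal sketch of that conversion follows; the function name and the constant are hypothetical, and double is used instead of float only to keep the illustration exact for the test values.

```c
/* Assumption: timestamps are in 100-nanosecond ticks (10^7 per second). */
#define SKETCH_TICKS_PER_SECOND 10000000.0

/* Hypothetical helper: elapsed seconds between two tick counts. */
double sketch_phase_seconds(unsigned long long start_ticks,
                            unsigned long long end_ticks)
{
  return (double) (end_ticks - start_ticks) / SKETCH_TICKS_PER_SECOND;
}
```

For example, a phase that starts at tick 5000000 and ends at tick 20000000 lasted 15000000 ticks, which this helper reports as 1.5 seconds.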
WL#3071 - Maria checkpoint
* Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing).
* Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown.
* fix for a test failure (after-merge fix)
include/maria.h:
  new variable
mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result:
  result update
mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test:
  position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail)
storage/maria/ha_maria.cc:
  Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init().
storage/maria/ma_checkpoint.c:
  Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset.
storage/maria/ma_recovery.c:
  Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't do anything significant (didn't open any tables, didn't roll back any transactions). With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
if (take_checkpoints && checkpoint_useful)
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes
storage/maria/ha_maria.cc:
  The checkpoint done after Recovery is finished, is moved to maria_recover().
storage/maria/ma_bitmap.c:
  fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c:
  info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
  correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c:
  fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h:
  new parameter
storage/maria/maria_read_log.c:
  update to new prototype
2007-10-08 19:08:25 +02:00
{
  /* No dirty pages, all tables are closed, no active transactions, save: */
  if (ma_checkpoint_execute(CHECKPOINT_FULL, FALSE))
    goto err;
}
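The guard `take_checkpoints && checkpoint_useful` follows the rule stated in the WL#3071 commit message: a checkpoint is skipped when Recovery did nothing significant, that is, when it opened no tables and rolled back no transactions. A minimal sketch of that decision, using a hypothetical summary struct and hypothetical function names rather than the actual bookkeeping in ma_recovery.c:

```c
#include <stdbool.h>

/* Hypothetical summary of what a recovery run did. */
typedef struct sketch_recovery_summary
{
  unsigned tables_opened;             /* tables touched by the REDO phase */
  unsigned transactions_rolled_back;  /* transactions undone by the UNDO phase */
} SKETCH_RECOVERY_SUMMARY;

/* A checkpoint is worth taking only if recovery changed something. */
bool sketch_checkpoint_useful(const SKETCH_RECOVERY_SUMMARY *summary)
{
  return summary->tables_opened > 0 ||
         summary->transactions_rolled_back > 0;
}

/* The caller may also disable checkpoints altogether (take_checkpoints). */
bool sketch_should_checkpoint(bool take_checkpoints,
                              const SKETCH_RECOVERY_SUMMARY *summary)
{
  return take_checkpoints && sketch_checkpoint_useful(summary);
}
```

After a clean shutdown both counters would stay at zero, so no checkpoint is taken and the restart avoids the extra fsync() calls on the log and control file that the commit message mentions.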
2007-07-26 11:56:21 +02:00
goto end;
err:
error= 1;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
tprint(tracef, "\nRecovery of tables with transaction logs FAILED\n");
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
end:
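The ma_loghandler.h note in the commit message above replaces a yes/no "ends a group" flag with three roles: a record may not end a group, may end a group, or may be a group in itself. A minimal sketch of how a log scanner could act on those roles follows. The enum and function names are hypothetical (they are not claimed to be the identifiers in ma_loghandler.h), and the sketch assumes, as the note implies, that the UNDO is the record that closes a group of REDOs.

```c
#include <stddef.h>

/* Hypothetical role names, for illustration only. */
enum group_role
{
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO: part of a group still being built */
  REC_LAST_IN_GROUP,      /* e.g. an UNDO: completes the REDOs logged before it */
  REC_IS_GROUP_ITSELF     /* e.g. a COMMIT: stands alone, aborts any open group */
};

/*
  Walk the records in log order and count how many would be executed.
  Records of a group are only executed once the group is closed; a
  record that is a group in itself discards an unfinished group and is
  then executed on its own.
*/
size_t count_applied_records(const enum group_role *recs, size_t n)
{
  size_t pending= 0;  /* records buffered in the currently open group */
  size_t applied= 0;
  size_t i;

  for (i= 0; i < n; i++)
  {
    switch (recs[i])
    {
    case REC_NOT_LAST_IN_GROUP:
      pending++;
      break;
    case REC_LAST_IN_GROUP:
      applied+= pending + 1;  /* the whole group, including this record */
      pending= 0;
      break;
    case REC_IS_GROUP_ITSELF:
      pending= 0;             /* abort the unfinished group */
      applied++;
      break;
    }
  }
  return applied;
}
```

With the sequence REDO, REDO, COMMIT (the write-failure case described in the note, where no UNDO was logged) only the COMMIT counts as applied; the two orphaned REDOs are dropped.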
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks its marker, conclude that it is corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
hash_free(&all_dirty_pages);
bzero(&all_dirty_pages, sizeof(all_dirty_pages));
my_free(dirty_pages_pool, MYF(MY_ALLOW_ZERO_PTR));
dirty_pages_pool= NULL;
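The ma_bitmap.c note in the WL#3072 commit message above describes a crash-safe ordering for creating the first bitmap page: extend the file to 8190 bytes only, then write the 2-byte end marker, so that a crash in between leaves a page that is detectably too short rather than a full page without its marker. The following is a rough sketch of that ordering using plain POSIX calls. The function names and the marker bytes are assumptions made for illustration; this is not the code in ma_bitmap.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define BITMAP_PAGE_SIZE 8192
/* Assumed marker bytes; the real end marker may differ. */
static const unsigned char bitmap_marker[2]= { 'b', 'm' };

/* Create the first bitmap page on an empty file. Returns 0 on success. */
int create_first_bitmap_page(int fd)
{
  off_t body= BITMAP_PAGE_SIZE - (off_t) sizeof(bitmap_marker);

  /* Step 1: extend to 8190 bytes, deliberately short of a full page. */
  if (ftruncate(fd, body) != 0)
    return -1;
  /* A crash here leaves a file that is visibly shorter than one page. */

  /* Step 2: appending the marker completes the page. */
  if (pwrite(fd, bitmap_marker, sizeof(bitmap_marker), body) !=
      (ssize_t) sizeof(bitmap_marker))
    return -1;
  return 0;
}

/*
  Returns 1 if a complete first bitmap page with its marker is present,
  0 if the page is too short or lacks the marker (caller should recreate
  it), -1 on I/O error.
*/
int first_bitmap_page_is_complete(int fd)
{
  struct stat st;
  unsigned char tail[sizeof(bitmap_marker)];

  if (fstat(fd, &st) != 0)
    return -1;
  if (st.st_size < BITMAP_PAGE_SIZE)
    return 0;
  if (pread(fd, tail, sizeof(tail),
            BITMAP_PAGE_SIZE - (off_t) sizeof(tail)) != (ssize_t) sizeof(tail))
    return -1;
  return memcmp(tail, bitmap_marker, sizeof(tail)) == 0;
}
```

The point of the ordering is that the file length itself signals an interrupted creation, so recovery does not need to tell a half-written page apart from a corrupted one.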
post-merge fixes, and fixes for some of the 16 compiler warnings found in pushbuild on sapsrv1. Some not fixed as not repeatable on my machine (32/64 bit issue?). Fixes for some test failures: - "maria-connect" now passes; - "maria": after fixing the obvious reasons for failures, the test went further and hit a more complex issue: a difference in the EXPLAIN output; not fixed; - "ps_maria" still crashes in assertion mysqld: ha_maria.cc:1627: virtual int ha_maria::index_read(uchar*, const uchar*, uint, ha_rkey_function): Assertion `inited == INDEX' failed, as already observable in pushbuild. All this might just be due to an incomplete merge of MyISAM changes into Maria when 5.1 was last merged to mysql-maria. include/my_global.h: temporary fix until next merge of 5.1; without this it does not build mysql-test/r/maria-connect.result: position changed mysql-test/t/maria-connect.test: If one wants to use the binlog it has to ask for it. 1582 is not used for dup entry error anymore (it was in older 5.1). Size of first event in binlog was increased by 4 (when the new type of event "gap" was added). mysql-test/t/maria.test: 1582 not used anymore in this case storage/maria/ha_maria.cc: engine now has to say what binlogging it supports storage/maria/ma_blockrec.c: fix for compiler warnings ("comparison is always true" or "always false") storage/maria/ma_loghandler.c: fix for compiler warnings (comparing char* to uchar*) storage/maria/ma_packrec.c: fix for compiler warning (fix simply merged from MyISAM) storage/maria/ma_pagecache.c: info_check_pin() was not used so gave a compiler warning. storage/maria/ma_pagecache.h: fixing typo from the last 5.1->maria merge. storage/maria/ma_recovery.c: my_free() has a void* argument, so why cast. byte->uchar. 
storage/maria/ma_search.c: fix for compiler warning (fix simply merged from MyISAM) storage/maria/maria_read_log.c: gptr->uchar* storage/maria/trnman.c: probable fix for warning found in pushbuild (but not on my machine): storage/maria/trnman.c: 142 passing argument 6 of ‘lf_hash_init’ from incompatible pointer type on sapsrv1.
2007-07-26 17:51:49 +02:00
my_free(all_tables, MYF(MY_ALLOW_ZERO_PTR));
all_tables= NULL;
my_free(all_active_trans, MYF(MY_ALLOW_ZERO_PTR));
all_active_trans= NULL;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
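The three-way classification of log records described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched as follows. This is a minimal illustration only: the enum values, the `group_state` struct, and `see_record()` are hypothetical names chosen for the sketch, not the definitions in ma_loghandler.h or ma_recovery.c.

```c
#include <assert.h>

/* Hypothetical stand-in for the enum that replaced the bool in LOG_DESC. */
enum sketch_group_role
{
  SKETCH_NOT_GROUP_END,    /* e.g. a REDO: part of a group, does not end it */
  SKETCH_GROUP_END,        /* e.g. an UNDO: ends the group of REDOs before it */
  SKETCH_IS_GROUP_ITSELF   /* e.g. COMMIT: stands alone, aborts any open group */
};

/* Counters used only to make the behaviour observable in this sketch. */
struct group_state
{
  unsigned pending;   /* records seen in the current, unfinished group */
  unsigned executed;  /* records that would be executed */
  unsigned aborted;   /* records dropped because their group never ended */
};

static void see_record(struct group_state *st, enum sketch_group_role role)
{
  switch (role)
  {
  case SKETCH_NOT_GROUP_END:
    st->pending++;                    /* remember it until the group ends */
    break;
  case SKETCH_GROUP_END:
    st->executed+= st->pending + 1;   /* the whole group, end record included */
    st->pending= 0;
    break;
  case SKETCH_IS_GROUP_ITSELF:
    st->aborted+= st->pending;        /* unfinished group is not executed */
    st->pending= 0;
    st->executed++;                   /* the stand-alone record itself */
    break;
  }
}
```

In this sketch, two REDOs followed directly by a COMMIT (the write-failure scenario from the commit message) leave both REDOs counted as aborted rather than executed.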
  my_free(log_record_buffer.str, MYF(MY_ALLOW_ZERO_PTR));
  log_record_buffer.str= NULL;
  log_record_buffer.length= 0;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
  ma_checkpoint_end();
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
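The commit message above names two conditions that produce warnings (UNDOs present but skipped; a LOGREC_INCOMPLETE_LOG seen while replaying) and says the count tempers the final "SUCCESS" message. A rough sketch of that flow follows; the struct, the function names, the one-warning-per-condition counting, and the message strings are all assumptions made for illustration, not the actual code of maria_apply_log() or maria_read_log.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical summary of what a log replay observed. */
struct replay_summary
{
  int undos_present_but_skipped;   /* UNDO phase skipped though UNDOs exist */
  int incomplete_log_record_seen;  /* ALTER TABLE / CREATE SELECT was logged */
};

/* Assumption: each condition contributes one warning. */
static unsigned count_warnings(const struct replay_summary *s)
{
  unsigned warnings= 0;
  if (s->undos_present_but_skipped)
    warnings++;                    /* tables may be left inconsistent */
  if (s->incomplete_log_record_seen)
    warnings++;                    /* log lacks some REDO_INSERT* records */
  return warnings;
}

/* Illustrative final message; the wording is invented for this sketch. */
static const char *final_message(int error, unsigned warnings)
{
  if (error)
    return "FAILED";
  return warnings ? "SUCCESS with warnings" : "SUCCESS";
}
```

Returning a count rather than printing unconditionally at startup lets the caller decide how loudly to report, which matches the stated goal of warning only when the log actually shows the problem.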
  *warnings_count= warnings;
  if (recovery_message_printed != REC_MSG_NONE)
  {
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
    fprintf(stderr, "\n");
    if (error)
      ma_message_no_user(0, "recovery failed");
    else
      ma_message_no_user(ME_JUST_INFO, "recovery done");
  }
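The calls above pass either 0 or ME_JUST_INFO, and the accompanying commit message explains that the flags decide whether the message is reported as [Note], as a warning, or as [ERROR]. A minimal sketch of that mapping is shown below. The flag values and the function name are assumptions for illustration (the real flag values are defined in include/my_sys.h and are not reproduced here), and the precedence of info over warning when both bits are set is a choice of this sketch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit values standing in for ME_JUST_INFO / ME_JUST_WARNING. */
#define SKETCH_JUST_INFO     1u
#define SKETCH_JUST_WARNING  2u

/*
  Pick the severity tag for a message: no flag means a plain error,
  which is why "recovery failed" above is sent with flags == 0.
*/
static const char *severity_for_flags(unsigned flags)
{
  if (flags & SKETCH_JUST_INFO)
    return "[Note]";
  if (flags & SKETCH_JUST_WARNING)
    return "[Warning]";
  return "[ERROR]";
}
```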
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
  procent_printed= 0;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
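The crash-resistance argument for the first bitmap page (extend the file to 8190 bytes, then write the 2-byte end marker, so that a crash in between leaves a page that is too short rather than a full page with no marker) can be checked with a small model. Everything here is a sketch under stated assumptions: the names are hypothetical, the marker bytes "bm" are taken from the signature mentioned in an earlier commit message rather than from ma_bitmap.c, and the file is modelled as an in-memory buffer plus a length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192u

/* Assumed 2-byte end marker of a bitmap page. */
static const unsigned char sketch_marker[2]= { 'b', 'm' };

enum sketch_page_status
{
  SKETCH_PAGE_OK,         /* full page with a valid end marker */
  SKETCH_PAGE_TOO_SHORT,  /* recovery can safely recreate the page */
  SKETCH_PAGE_CORRUPTED   /* full page but no marker: recovery would fail */
};

/* Classify the first bitmap page of a file of file_length valid bytes. */
static enum sketch_page_status
check_first_bitmap_page(const unsigned char *file, size_t file_length)
{
  if (file_length < SKETCH_PAGE_SIZE)
    return SKETCH_PAGE_TOO_SHORT;
  if (memcmp(file + SKETCH_PAGE_SIZE - sizeof(sketch_marker),
             sketch_marker, sizeof(sketch_marker)) != 0)
    return SKETCH_PAGE_CORRUPTED;
  return SKETCH_PAGE_OK;
}
```

With the old order (extend to 8192, crash before the marker write) the model reports a corrupted page; with the new order (extend to 8190, crash) it reports a too-short page, which is the recoverable case.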
  /* we don't cleanly close tables if we hit some error (may corrupt them) */
  DBUG_RETURN(error);
}
/* very basic info about the record's header */
static void display_record_position(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec,
                                    uint number)
{
  /*
    if number==0, we're going over records which we had already seen and which
    form a group, so we indent below the group's end record
  */
  tprint(tracef,
         "%sRec#%u LSN (%lu,0x%lx) short_trid %u %s(num_type:%u) len %lu\n",
         number ? "" : " ", number, LSN_IN_PARTS(rec->lsn),
         rec->short_trid, log_desc->name, rec->type,
         (ulong)rec->record_length);
}
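
/*
  Executes the REDO-phase hook registered for this record's type.
  Returns 0 on success, 1 if the record type has no hook yet,
  otherwise the error code returned by the hook.
*/
static int display_and_apply_record(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec)
{
  int error;
  if (log_desc->record_execute_in_redo_phase == NULL)
  {
    /* die on all not-yet-handled records :) */
    DBUG_ASSERT("one more hook" == "to write");
    return 1;
  }
  if ((error= (*log_desc->record_execute_in_redo_phase)(rec)))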
    eprint(tracef, "Got error %d when executing record\n", my_errno);
  return error;
}
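
The function above dispatches through a per-record-type hook pointer and treats a missing hook as an error. Below is a minimal standalone sketch of that pattern, not the Maria code itself: DEMO_RECORD, DEMO_LOG_DESC, demo_exec_ok(), demo_exec_fail() and demo_apply_record() are hypothetical stand-ins for TRANSLOG_HEADER_BUFFER, LOG_DESC and the real hooks, and errors go to stderr instead of eprint().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for TRANSLOG_HEADER_BUFFER. */
typedef struct demo_record
{
  int type;
  unsigned long length;
} DEMO_RECORD;

/* Signature of a REDO-phase hook: returns 0 on success. */
typedef int (*demo_redo_hook)(const DEMO_RECORD *rec);

/* Hypothetical stand-in for LOG_DESC: a name plus an optional hook. */
typedef struct demo_log_desc
{
  const char *name;
  demo_redo_hook execute_in_redo_phase; /* NULL if not handled yet */
} DEMO_LOG_DESC;

/* Counts successfully applied records (illustration only). */
int demo_applied_count= 0;

/* A hook that always succeeds. */
int demo_exec_ok(const DEMO_RECORD *rec)
{
  (void) rec;
  demo_applied_count++;
  return 0;
}

/* A hook that always fails with an arbitrary error code. */
int demo_exec_fail(const DEMO_RECORD *rec)
{
  (void) rec;
  return 5;
}

/*
  Applies one record through its descriptor's hook.
  Returns 0 on success, 1 if there is no hook, else the hook's error.
*/
int demo_apply_record(const DEMO_LOG_DESC *desc, const DEMO_RECORD *rec)
{
  int error;
  assert(desc != NULL && rec != NULL);
  if (desc->execute_in_redo_phase == NULL)
  {
    fprintf(stderr, "No REDO hook for record type %s\n", desc->name);
    return 1;
  }
  if ((error= (*desc->execute_in_redo_phase)(rec)))
    fprintf(stderr, "Got error %d when executing %s\n", error, desc->name);
  return error;
}
```

A caller would build one descriptor per record type and pass each record read from the log to demo_apply_record(), stopping at the first non-zero return. Unlike the real function, this sketch reports the hook's own return value rather than my_errno.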
prototype_redo_exec_hook(LONG_TRANSACTION_ID)
{
  uint16 sid= rec->short_trid;
  TrID long_trid= all_active_trans[sid].long_trid;
  /*
    Abort this transaction's unfinished group, if any (such a group must
    date from before a crash).
  */
  LSN gslsn= all_active_trans[sid].group_start_lsn;
  if (gslsn != LSN_IMPOSSIBLE)
  {
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
    tprint(tracef, "Group at LSN (%lu,0x%lx) short_trid %u incomplete\n",
           LSN_IN_PARTS(gslsn), sid);
    all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
  }
  if (long_trid != 0)
  {
    LSN ulsn= all_active_trans[sid].undo_lsn;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started unitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
    /*
      If the first record of that transaction is after 'rec', it's probably
      because that transaction was found in the checkpoint record, and then
      it's ok, we can forget about that transaction (we'll meet it later
      again in the REDO phase) and replace it with the one in 'rec'.
    */
    if ((ulsn != LSN_IMPOSSIBLE) &&
        (cmp_translog_addr(ulsn, rec->lsn) < 0))
    {
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
      char llbuf[22];
2007-07-26 11:56:21 +02:00
llstr(long_trid, llbuf);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs the used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code
Don't allow pagecaches with fewer than 8 blocks (causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined, some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused the program to sometimes fail
Added some missing calls to MY_INIT() whose absence caused some unit tests to fail
Fixed TRUNCATE so that it works properly on temporary MyISAM files
Updated some result files to the new table checksum results (checksums where NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  If the file is too short when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  If the file is too short when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Log used key map in REDO log
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed unneeded pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Removed unneeded include file
  Changed logging to use fd: for file descriptors, as other code does
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with fewer than 8 blocks
  Removed wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined, some REDO_INDEX entries contain page checksums
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused the program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
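The commit above replaces "test for -1" with a dedicated "file too short" error on exact-length reads. The following is a minimal sketch of that idea only, not the mysys implementation: `demo_pread_exact()` and `DEMO_ERR_FILE_TOO_SHORT` are hypothetical names invented for illustration, and the numeric value of the constant is arbitrary.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for HA_ERR_FILE_TOO_SHORT; the value is arbitrary. */
#define DEMO_ERR_FILE_TOO_SHORT 1000

/*
  Read exactly 'count' bytes starting at 'offset' (hypothetical helper).

  Returns 0 on success, DEMO_ERR_FILE_TOO_SHORT if end of file is reached
  before 'count' bytes were read, and -1 (with errno set) on an I/O error.
  Giving the short-file case its own code lets callers tell it apart from
  a genuine read failure.
*/
static int demo_pread_exact(int fd, void *buf, size_t count, off_t offset)
{
  unsigned char *pos= buf;
  while (count > 0)
  {
    ssize_t got= pread(fd, pos, count, offset);
    if (got < 0)
    {
      if (errno == EINTR)
        continue;                       /* interrupted, retry */
      return -1;                        /* real I/O error */
    }
    if (got == 0)
      return DEMO_ERR_FILE_TOO_SHORT;   /* hit end of file too early */
    pos+= got;
    count-= (size_t) got;
    offset+= got;
  }
  return 0;
}
```

A caller that previously compared the result with -1 can then branch on the dedicated code, for example to report a truncated table file rather than a generic read error.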
eprint(tracef, "Found an old transaction long_trid %s short_trid %u"
" with same short id as this new transaction, and has neither"
" committed nor rollback (undo_lsn: (%lu,0x%lx))\n",
llbuf, sid, LSN_IN_PARTS(ulsn));
goto err;
}
}
long_trid= uint6korr(rec->header);
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes, then pwrite (overwrite) of the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full-size bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude that it is corrupted, and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO creates a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDOs for INSERT/UPDATE/DELETE are always about real transactions; we can assert this.
  A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
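The ma_bitmap.c note in the commit above describes a two-step page creation in which the file is first sized to 8190 bytes and the 2-byte end marker is written afterwards, so that a crash between the steps leaves a visibly short page. A rough sketch of that ordering with plain POSIX calls follows. The function names, the enum and the check routine are hypothetical, and the "bm" marker bytes are taken from the earlier commit message rather than from the real source.

```c
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BLOCK_SIZE 8192
/* Assumed end marker ("bm" signature mentioned in the commit messages). */
static const unsigned char demo_marker[2]= { 'b', 'm' };

enum demo_page_state
{
  DEMO_PAGE_OK,          /* full page with its end marker */
  DEMO_PAGE_TOO_SHORT,   /* incomplete creation: safe to recreate */
  DEMO_PAGE_NO_MARKER,   /* full length but no marker: looks corrupted */
  DEMO_PAGE_IO_ERROR
};

/* Create the first bitmap page: size the body first, append the marker last. */
static int demo_create_first_bitmap_page(int fd)
{
  off_t body= DEMO_BLOCK_SIZE - sizeof(demo_marker);      /* 8190 bytes */
  if (ftruncate(fd, body) != 0)
    return -1;
  /* A crash here leaves an 8190-byte file, which the check reports as short. */
  if (pwrite(fd, demo_marker, sizeof(demo_marker), body) !=
      (ssize_t) sizeof(demo_marker))
    return -1;
  return 0;
}

/* Classify the first bitmap page the way a recovery pass might. */
static enum demo_page_state demo_check_first_bitmap_page(int fd)
{
  struct stat st;
  unsigned char end[sizeof(demo_marker)];
  if (fstat(fd, &st) != 0)
    return DEMO_PAGE_IO_ERROR;
  if (st.st_size < DEMO_BLOCK_SIZE)
    return DEMO_PAGE_TOO_SHORT;
  if (pread(fd, end, sizeof(end), DEMO_BLOCK_SIZE - sizeof(end)) !=
      (ssize_t) sizeof(end))
    return DEMO_PAGE_IO_ERROR;
  return memcmp(end, demo_marker, sizeof(end)) == 0 ?
         DEMO_PAGE_OK : DEMO_PAGE_NO_MARKER;
}
```

With the old ordering (size to 8192, then overwrite the last two bytes) an interrupted creation would land in the DEMO_PAGE_NO_MARKER case, which cannot be told apart from real corruption. With the new ordering it lands in DEMO_PAGE_TOO_SHORT, which recovery can handle by recreating the page.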
new_transaction(sid, long_trid, LSN_IMPOSSIBLE, LSN_IMPOSSIBLE);
  goto end;
err:
  ALERT_USER();
  return 1;
end:
  return 0;
}
static void new_transaction(uint16 sid, TrID long_id, LSN undo_lsn,
                            LSN first_undo_lsn)
{
  char llbuf[22];
  all_active_trans[sid].long_trid= long_id;
  llstr(long_id, llbuf);
  tprint(tracef, "Transaction long_trid %s short_trid %u starts\n",
         llbuf, sid);
  all_active_trans[sid].undo_lsn= undo_lsn;
  all_active_trans[sid].first_undo_lsn= first_undo_lsn;
  set_if_bigger(max_long_trid, long_id);
}
prototype_redo_exec_hook_dummy(CHECKPOINT)
{
  /* the only checkpoint we care about was found via control file, ignore */
  return 0;
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
prototype_redo_exec_hook_dummy(INCOMPLETE_GROUP)
{
  /* abortion was already made */
  return 0;
}
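The boundary described in the commit note for LOGREC_INCOMPLETE_GROUP can be illustrated with a small standalone sketch. It is not the code of ma_recovery.c: the enum `rec_kind` and the function `count_executed_redos()` are hypothetical names, and the model assumes a single transaction whose REDOs are buffered until an UNDO or CLR_END closes the group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds, simplified from the commit note above. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Count how many REDO records a log scan would execute for one transaction.
  REDOs stay pending until a group-ending record (UNDO or CLR_END) arrives;
  an INCOMPLETE_GROUP record discards whatever is pending at that point.
*/
static size_t count_executed_redos(const enum rec_kind *log, size_t n)
{
  size_t pending= 0, executed= 0;
  size_t i;
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;
      break;
    case REC_UNDO:
    case REC_CLR_END:
      executed+= pending;          /* the group is complete: apply it */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;                  /* boundary: drop the unfinished group */
      break;
    }
  }
  return executed;
}
```

In this model, the sequence REDO, UNDO, REDO*, REDO, CLR_END executes three REDOs, because the orphaned REDO* gets paired with the later CLR_END. With the boundary record written after REDO*, only two are executed, which is the behaviour the fix aims for.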
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionility, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
prototype_redo_exec_hook(INCOMPLETE_LOG)
{
  MARIA_HA *info;
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
  if ((info= get_MARIA_HA_from_REDO_record(rec)) == NULL)
  {
    /* no such table, don't need to warn */
    return 0;
  }
  /*
    Example of what can go wrong when replaying DDLs:
    CREATE TABLE t (logged); INSERT INTO t VALUES(1) (logged);
    ALTER TABLE t ... which does
      CREATE a temporary table #sql... (logged)
      INSERT data from t into #sql... (not logged)
      RENAME #sql TO t (logged)
    Removing tables by hand and replaying the log will leave in the
    end an empty table "t": missing records. If after the RENAME an INSERT
    into t was done, that row had number 1 in its page, executing the
    REDO_INSERT_ROW_HEAD on the recreated empty t will fail (assertion
    failure in _ma_apply_redo_insert_row_head_or_tail(): new data page is
    created whereas rownr is not 0).
    So when the server disables logging for ALTER TABLE or CREATE SELECT, it
    logs LOGREC_INCOMPLETE_LOG to warn maria_read_log and then the user.
    Another issue is that replaying of DDLs is not correct enough to work if
    there was a crash during a DDL (see comment in execution of
    REDO_RENAME_TABLE).
  */
  tprint(tracef, "***WARNING: MySQL server currently logs no records"
         " about insertion of data by ALTER TABLE and CREATE SELECT,"
         " as they are not necessary for recovery;"
         " present applying of log records may well not work.***\n");
  warnings++;
  return 0;
}
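The decision made by the INCOMPLETE_LOG hook above (no warning when DDLs are skipped, no warning when the table no longer exists, one warning otherwise) can be restated as a standalone sketch. The struct `incomplete_log_rec` and the function `count_incomplete_log_warnings()` are hypothetical simplifications for illustration; they are not part of the Maria sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one INCOMPLETE_LOG record. */
struct incomplete_log_rec
{
  bool table_exists;   /* stands in for a non-NULL MARIA_HA lookup */
};

/*
  Return how many warnings a replay would raise for these records.
  Mirrors the hook's three outcomes: skip everything when DDLs are skipped,
  stay silent for tables that are gone, warn once per remaining record.
*/
static unsigned count_incomplete_log_warnings(
    const struct incomplete_log_rec *recs, size_t n, bool skip_ddls)
{
  unsigned warnings= 0;
  size_t i;
  if (skip_ddls)
    return 0;
  for (i= 0; i < n; i++)
  {
    if (recs[i].table_exists)
      warnings++;
  }
  return warnings;
}
```

A caller could then use a non-zero count to temper its final success message, as the commit note describes for maria_read_log.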
prototype_redo_exec_hook(REDO_CREATE_TABLE)
{
  File dfile= -1, kfile= -1;
  char *linkname_ptr, filename[FN_REFLEN];
  char *name, *ptr;
  myf create_flag;
  uint flags;
  int error= 1, create_mode= O_RDWR | O_TRUNC;
  MARIA_HA *info= NULL;
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner() 
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
  uint kfile_size_before_extension, keystart;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
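The early return above is the visible half of a policy: recovery started from ha_maria skips DDL records, while maria_read_log replays them. The rest of that decision can be sketched as a small pure function. This is only an illustration under stated assumptions: `ddl_decision`, `table_probe` and `decide_ddl_replay` are hypothetical names that do not exist in ma_recovery.c, and the LSN comparison is an assumed rule modeled on the create_rename_lsn test that the change history mentions for REDO_RENAME.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of examining one DDL log record. */
typedef enum { DDL_SKIP, DDL_REPLAY } ddl_decision;

/* Hypothetical summary of what is known about the target table. */
typedef struct
{
  bool exists;                 /* table files are present on disk */
  bool corrupted;              /* table is marked as crashed */
  uint64_t create_rename_lsn;  /* LSN stamped at the last create/rename */
} table_probe;

/*
  Decide whether a DDL record should be replayed.
  Assumed rules, in order:
  - server-side recovery never replays DDLs (skip_ddls is set);
  - an existing but corrupted table is left untouched;
  - an existing table at least as recent as the record is left untouched.
*/
static ddl_decision decide_ddl_replay(bool skip_ddls, const table_probe *t,
                                      uint64_t record_lsn)
{
  if (skip_ddls)
    return DDL_SKIP;
  if (t->exists && t->corrupted)
    return DDL_SKIP;
  if (t->exists && t->create_rename_lsn >= record_lsn)
    return DDL_SKIP;
  return DDL_REPLAY;
}
```

Keeping the decision free of I/O makes each rule easy to check in isolation; the real hook interleaves these checks with opening and closing the table.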
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with fewer than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_dbug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tmpdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with fewer than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    eprint(tracef, "Failed to read record\n");
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
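The three-way classification described in the ma_loghandler.h note above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched as follows. This is only an illustration: the enum, the struct and see_record() are hypothetical names, not the identifiers used in ma_loghandler.h or ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-way classification replacing the former bool. */
enum group_role
{
  REC_NOT_GROUP_END,    /* e.g. a REDO inside a group */
  REC_GROUP_END,        /* e.g. the UNDO which closes the group */
  REC_IS_GROUP_ITSELF   /* e.g. COMMIT: stands alone, aborts an open group */
};

/* Hypothetical scanner state, counting records only. */
struct group_state
{
  size_t pending;   /* records seen in the current, unfinished group */
  size_t executed;  /* records handed over for execution */
  size_t aborted;   /* records dropped because their group never ended */
};

/* Feed the classification of one log record into the scanner state. */
void see_record(struct group_state *st, enum group_role role)
{
  switch (role)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                     /* wait for the end of the group */
    break;
  case REC_GROUP_END:
    st->executed+= st->pending + 1;    /* whole group is complete */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;         /* unfinished group is not executed */
    st->pending= 0;
    st->executed++;                    /* the record itself is processed */
    break;
  }
}
```

With this shape, two REDOs followed directly by a COMMIT (the write-failure case from the message) leave the REDOs unexecuted, which is the behaviour the message asks for.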
goto end;
}
name= log_record_buffer.str;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
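The single-loop behaviour described for close_one_table() above (scan only the registered tables, close the instances whose name matches) could look roughly like the sketch below. The struct, its fields and close_by_name() are hypothetical stand-ins; the real function works on all_tables[] and MARIA_HA handles and has a different signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; the real all_tables[] holds engine handles. */
struct table_slot
{
  const char *name;   /* NULL if the slot is unused */
  int is_open;        /* non-zero while the table is open */
};

/*
  Close every registered, open table called 'name' in one pass.
  Tables which are not registered in the array are never touched.
  Returns the number of instances closed.
*/
int close_by_name(struct table_slot *tables, size_t count, const char *name)
{
  int closed= 0;
  size_t i;
  for (i= 0; i < count; i++)
  {
    if (tables[i].is_open && tables[i].name != NULL &&
        strcmp(tables[i].name, name) == 0)
    {
      tables[i].is_open= 0;   /* stand-in for the real close call */
      closed++;
    }
  }
  return closed;
}
```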
/*
TRUNCATE TABLE and REPAIR USE_FRM call maria_create(), so below we can
find a REDO_CREATE_TABLE for a table which we have open, that's why we
need to look for any open instances and close them first.
*/
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
if (close_one_table(name, rec->lsn))
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log.
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Table '%s' got error %d on close\n", name, my_errno);
ALERT_USER();
goto end;
}
/* we try hard to get create_rename_lsn, to avoid mistakes if possible */
info= maria_open(name, O_RDONLY, HA_OPEN_FOR_REPAIR);
if (info)
{
MARIA_SHARE *share= info->s;
/* check that we're not already using it */
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
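The REDO_RENAME rule from the message above (if the 'new_name' table exists and is more recent than the record, remove the 'old_name' table) can be written as a small decision function. This is a sketch under assumptions: lsn_t, the enum and decide_rename() are hypothetical names, "more recent" is taken to mean create_rename_lsn >= the record's LSN, and the fallback branch (replay the rename whenever the old table exists) is a guess, not something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;   /* stand-in for the engine's LSN type */

enum rename_action
{
  RENAME_OLD_TO_NEW,  /* replay the rename */
  REMOVE_OLD_ONLY,    /* rename already took effect: drop leftover old table */
  SKIP_RECORD         /* nothing to do for this record */
};

/*
  Decide what replaying one REDO_RENAME record should do, given which
  tables exist on disk and the create_rename_lsn of the new one.
*/
enum rename_action decide_rename(int old_exists, int new_exists,
                                 lsn_t new_create_rename_lsn, lsn_t rec_lsn)
{
  /* From the commit message: new table exists and is newer than the record. */
  if (new_exists && new_create_rename_lsn >= rec_lsn)
    return old_exists ? REMOVE_OLD_ONLY : SKIP_RECORD;
  /* Assumption: otherwise replay the rename if there is a table to rename. */
  return old_exists ? RENAME_OLD_TO_NEW : SKIP_RECORD;
}
```

Comparing against create_rename_lsn is what keeps the replay from redoing a rename that already reached disk before the crash.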
if (share->reopen != 1)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache(). storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache(). storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache(). storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache(). storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache(). storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache(). storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free(). Fixed memory leak. Changed logic a bit while trying to find bug in reset_file(). storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT(). storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1. storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files. storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1. storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1. mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Table '%s' is already open (reopen=%u)\n",
name, share->reopen);
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
ALERT_USER();
goto end;
}
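The commit note above describes how a REDO_RENAME record is replayed: DDLs are skipped when recovery runs from ha_maria, corrupted tables are left alone, and an existing 'new_name' table that is more recent than the record means only 'old_name' is removed. A minimal sketch of that decision follows. It is not the code in ma_recovery.c: `table_probe`, `rename_action` and `decide_rename` are hypothetical names, the LSN is simplified to a plain 64-bit integer, and treating an equal LSN as "more recent" is an assumption borrowed from the `>= 0` test used for table creation.

```c
#include <stdint.h>

/* Hypothetical outcome of examining a REDO_RENAME record. */
enum rename_action
{
  RENAME_APPLY,     /* perform the rename old_name -> new_name */
  RENAME_DROP_OLD,  /* new_name is already up to date: only remove old_name */
  RENAME_SKIP       /* leave all files untouched */
};

/* Hypothetical summary of what was found for the 'new_name' table. */
struct table_probe
{
  int exists;                  /* non-zero if the table could be opened */
  int corrupted;               /* non-zero if the table is marked corrupted */
  uint64_t create_rename_lsn;  /* simplified LSN, larger means more recent */
};

/*
  Decide what to do with a REDO_RENAME record.
  replay_ddl is zero when recovery runs from ha_maria (DDLs are not
  replayed there) and non-zero for maria_read_log.
*/
static enum rename_action decide_rename(const struct table_probe *new_tab,
                                        uint64_t rec_lsn, int replay_ddl)
{
  if (!replay_ddl)
    return RENAME_SKIP;
  if (new_tab->exists)
  {
    if (new_tab->corrupted)
      return RENAME_SKIP;                 /* play safe on corrupted tables */
    if (new_tab->create_rename_lsn >= rec_lsn)
      return RENAME_DROP_OLD;             /* rename already happened */
  }
  return RENAME_APPLY;
}
```

The sketch ignores the state of the 'old_name' table, which the real code also has to examine.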
2007-07-26 11:56:21 +02:00
DBUG_ASSERT(share->now_transactional == share->base.born_transactional);
if (!share->base.born_transactional)
{
/*
It could be that a transactional table was later dropped and a
non-transactional one was renamed to its name; in that case
create_rename_lsn is 0 and should not be trusted.
*/
2007-12-04 22:23:42 +01:00
tprint(tracef, "Table '%s' is not transactional, ignoring creation\n",
name);
2007-07-26 11:56:21 +02:00
ALERT_USER();
error= 0;
goto end;
}
if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
{
2007-12-04 22:23:42 +01:00
tprint(tracef, "Table '%s' has create_rename_lsn (%lu,0x%lx) more "
"recent than record, ignoring creation",
name, LSN_IN_PARTS(share->state.create_rename_lsn));
2007-07-26 11:56:21 +02:00
error= 0;
goto end;
}
if (maria_is_crashed(info))
{
eprint(tracef, "Table '%s' is crashed, can't recreate it\n", name);
ALERT_USER();
WL#3072 Maria Recovery
* recovery from ha_maria now skips replaying DDLs (too dangerous)
* maria_read_log still replays DDLs, print warning about issues
* fixes to replaying of REDO_RENAME
* don't replay DDLs on corrupted tables (safer)
* print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read.
storage/maria/ma_checkpoint.c:
  fix for assertion failure
storage/maria/ma_recovery.c:
  * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too.
  * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions).
  * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table).
  * don't replay DDLs on corrupted tables (play safe)
  * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open).
  * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes).
storage/maria/ma_recovery.h:
  parameter to replay DDLs or not
storage/maria/maria_read_log.c:
  replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
goto end;
}
maria_close(info);
info= NULL;
}
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
else /* one or two files absent, or header corrupted... */
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
tprint(tracef, "Table '%s' can't be opened, probably does not exist\n",
name);
/* if the table does not exist, or is older than this record, overwrite it */
ptr= name + strlen(name) + 1;
if ((flags= ptr[0] ? HA_DONT_TOUCH_DATA : 0))
tprint(tracef, ", we will only touch index file");
ptr++;
kfile_size_before_extension= uint2korr(ptr);
ptr+= 2;
keystart= uint2korr(ptr);
ptr+= 2;
uchar *kfile_header= ptr;
ptr+= kfile_size_before_extension;
/* set create_rename_lsn (for maria_read_log to be idempotent) */
lsn_store(kfile_header + sizeof(info->s->state.header) + 2, rec->lsn);
/* we also set is_of_horizon, like maria_create() does */
lsn_store(kfile_header + sizeof(info->s->state.header) + 2 + LSN_STORE_SIZE,
rec->lsn);
uchar *data_file_name= ptr;
ptr+= strlen(data_file_name) + 1;
uchar *index_file_name= ptr;
ptr+= strlen(index_file_name) + 1;
/** @todo handle symlinks */
if (data_file_name[0] || index_file_name[0])
{
eprint(tracef, "Table '%s' DATA|INDEX DIRECTORY clauses are not handled\n",
name);
goto end;
}
fn_format(filename, name, "", MARIA_NAME_IEXT,
/* '|' binds tighter than '?:', so the conditional needs its own parentheses */
(MY_UNPACK_FILENAME |
((flags & HA_DONT_TOUCH_DATA) ? MY_RETURN_REAL_PATH : 0)) |
MY_APPEND_EXT);
linkname_ptr= NULL;
create_flag= MY_DELETE_OLD;
tprint(tracef, "Table '%s' creating as '%s'", name, filename);
if ((kfile= my_create_with_symlink(linkname_ptr, filename, 0, create_mode,
MYF(MY_WME|create_flag))) < 0)
{
eprint(tracef, "Failed to create index file\n");
goto end;
}
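/* Write the saved header at offset 0, then size the index file to keystart bytes (zero-filled if it grows) */
if (my_pwrite(kfile, kfile_header,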
              kfile_size_before_extension, 0, MYF(MY_NABP|MY_WME)) ||
    my_chsize(kfile, keystart, 0, MYF(MY_WME)))
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updated some result files to new table checksum results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!

sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tmpdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with less than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to write to index file\n");
goto end;
}
/* Unless the flags say to leave the data file alone, recreate it empty */
if (!(flags & HA_DONT_TOUCH_DATA))
{
  fn_format(filename, name, "", MARIA_NAME_DEXT,
            MY_UNPACK_FILENAME | MY_APPEND_EXT);
  linkname_ptr= NULL;
  create_flag= MY_DELETE_OLD;
  if (((dfile=
        my_create_with_symlink(linkname_ptr, filename, 0, create_mode,
                               MYF(MY_WME | create_flag))) < 0) ||
      my_close(dfile, MYF(MY_WME)))
  {
eprint(tracef, "Failed to create data file\n");
goto end;
}
/*
We now have an empty data file. To be able to call
_ma_initialize_data_file() we need some pieces of the share to be
correctly filled. So we just open the table (fortunately, an empty
data file does not preclude this).
*/
if (((info= maria_open(name, O_RDONLY, 0)) == NULL) ||
_ma_initialize_data_file(info->s, info->dfile.file))
{
eprint(tracef, "Failed to open new table or write to data file\n");
2007-07-26 11:56:21 +02:00
      goto end;
    }
  }
  error= 0;
end:
  tprint(tracef, "\n");
  if (kfile >= 0)
    error|= my_close(kfile, MYF(MY_WME));
  if (info != NULL)
    error|= maria_close(info);
  return error;
}
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
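The two LSN checks described in the ma_recovery.c notes above (skip a table whose id->name mapping is older than create_rename_lsn, and skip a record that is older than the mapping) can be sketched as follows. This is a minimal illustration and not the code in ma_recovery.c: `lsn_t`, `struct table_mapping` and the three functions are hypothetical names, LSNs are modelled as plain 64-bit integers, and the treatment of equal LSNs is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: a plain 64-bit log position. */
typedef uint64_t lsn_t;

/* Hypothetical, reduced view of one id->name mapping as Recovery sees it. */
struct table_mapping
{
  lsn_t lsn_of_file_id;    /* LSN of the LOGREC_FILE_ID that made the mapping */
  lsn_t create_rename_lsn; /* LSN stored in the table at last create/rename */
};

/* new_table()-style check: the mapping predates the table's current
   incarnation, so it must not be used (ties: assumed not stale). */
static bool mapping_is_stale(const struct table_mapping *m)
{
  return m->lsn_of_file_id < m->create_rename_lsn;
}

/* get_MARIA_HA_from_REDO_record()-style check: the mapping was logged after
   the record, so it says nothing about the table the record refers to. */
static bool mapping_newer_than_record(const struct table_mapping *m,
                                      lsn_t record_lsn)
{
  return m->lsn_of_file_id > record_lsn;
}

/* Combined decision: apply a REDO only if neither check rejects it and the
   record is not before create_rename_lsn. */
static bool should_apply_redo(const struct table_mapping *m, lsn_t record_lsn)
{
  if (mapping_is_stale(m))
    return false;
  if (mapping_newer_than_record(m, record_lsn))
    return false;
  /* "All REDOs before create_rename_lsn are ignored by Recovery." */
  return record_lsn >= m->create_rename_lsn;
}
```

With a mapping logged at LSN 100 for a table whose create_rename_lsn is 50, a record at LSN 150 would be applied, while a record at LSN 80 would be skipped because the mapping is newer than the record.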
prototype_redo_exec_hook(REDO_RENAME_TABLE)
{
  char *old_name, *new_name;
  int error= 1;
  MARIA_HA *info= NULL;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
    eprint(tracef, "Failed to read record\n");
    goto end;
  }
  old_name= log_record_buffer.str;
  new_name= old_name + strlen(old_name) + 1;
  tprint(tracef, "Table '%s' to rename to '%s'; old-name table ", old_name,
         new_name);
  /*
    Here is why we skip CREATE/DROP/RENAME when doing a recovery from
    ha_maria (whereas we do replay them when called from maria_read_log).
    Consider:
    CREATE TABLE t;
    RENAME TABLE t to u;
    DROP TABLE u;
    RENAME TABLE v to u; # crash between index rename and data rename.
    And do a Recovery (not removing tables beforehand).
    Recovery replays CREATE, then RENAME: the maria_open("t") works,
    maria_open("u") does not (no data file) so table "u" is considered
    nonexistent and so maria_rename() is done which overwrites u's index file,
    which is lost. Ok, the data file (v.MAD) is still available, but only a
    REPAIR USE_FRM can rebuild the index, which is unsafe and means downtime.
    So it is preferable to not execute RENAME, and leave the "mess" of files,
    rather than possibly destroy a file. DBA will manually rename files.
    A safe recovery method would probably require checking the existence of
    the index file and of the data file separately (not via maria_open()), and
    maybe also storing a create_rename_lsn in the data file.
    For now, all we risk is to leave the mess (half-renamed files) left by the
    crash. We however sync files and directories at each file rename. The SQL
    layer is anyway not crash-safe for DDLs (except the repartitioning-related
    ones).
    We replay DDLs in maria_read_log to be able to recreate tables from
    scratch. It means that "maria_read_log -a" should not be used on a
    database which just crashed during a DDL. And also ALTER TABLE does not
    log insertions of records into the temporary table, so replaying may
    fail (grep for INCOMPLETE_LOG in files).
  */
  info= maria_open(old_name, O_RDONLY, HA_OPEN_FOR_REPAIR);
  if (info)
  {
    MARIA_SHARE *share= info->s;
    if (!share->base.born_transactional)
    {
      tprint(tracef, ", is not transactional, ignoring renaming\n");
      ALERT_USER();
      error= 0;
      goto end;
    }
    if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
    {
      tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
             " record, ignoring renaming",
             LSN_IN_PARTS(share->state.create_rename_lsn));
      error= 0;
      goto end;
    }
    if (maria_is_crashed(info))
    {
      tprint(tracef, ", is crashed, can't rename it");
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
      ALERT_USER();
      goto end;
    }
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
    if (close_one_table(info->s->open_file_name, rec->lsn) ||
        maria_close(info))
      goto end;
    info= NULL;
    tprint(tracef, ", is ok for renaming; new-name table ");
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
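The fixes to replaying of REDO_RENAME listed in this changeset amount to a small decision table for the 'new_name' table. The sketch below is illustrative only and is not the code in ma_recovery.c: sketch_lsn, sketch_table, sketch_action and decide_new_name() are hypothetical names, LSNs are modelled as plain integers rather than compared with cmp_translog_addr(), and the "table absent, so replay the rename" branch is an assumption, since the changeset text only describes the case where the new_name table exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; the real code compares LSNs with
   cmp_translog_addr(). */
typedef uint64_t sketch_lsn;

/* Hypothetical summary of what recovery learns by opening new_name. */
typedef struct
{
  int exists;                    /* could the table be opened at all? */
  unsigned reopen;               /* open instances, including ours */
  int born_transactional;        /* was it created transactional? */
  sketch_lsn create_rename_lsn;  /* LSN of its last create/rename */
} sketch_table;

typedef enum
{
  SKETCH_RENAME,    /* replay the rename old_name -> new_name */
  SKETCH_DROP_OLD,  /* leave new_name alone, drop old_name instead */
  SKETCH_ERROR      /* unexpected state, stop recovery of this record */
} sketch_action;

/* Decide what to do with a REDO_RENAME record logged at rec_lsn. */
static sketch_action decide_new_name(const sketch_table *new_tbl,
                                     sketch_lsn rec_lsn)
{
  assert(new_tbl != NULL);
  if (!new_tbl->exists)
    return SKETCH_RENAME;        /* assumption: nothing to overwrite */
  if (new_tbl->reopen != 1)
    return SKETCH_ERROR;         /* nobody else should have it open */
  if (!new_tbl->born_transactional)
    return SKETCH_DROP_OLD;      /* never overwrite a non-logged table */
  if (new_tbl->create_rename_lsn >= rec_lsn)
    return SKETCH_DROP_OLD;      /* new_name is more recent than the record */
  return SKETCH_RENAME;
}
```

For the log CREATE TABLE t; CREATE TABLE v; RENAME TABLE t TO u; DROP TABLE u; RENAME TABLE v TO u, replayed over existing files, the first rename finds u with a create_rename_lsn more recent than the record, so the sketch returns SKETCH_DROP_OLD: u is left untouched and t is dropped.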
  }
  else /* one or two files absent, or header corrupted... */
  {
    tprint(tracef, ", can't be opened, probably does not exist");
    error= 0;
    goto end;
  }
  /*
    We must also check the create_rename_lsn of the 'new_name' table if it
    exists: otherwise we may, with our rename which overwrites, destroy
    another table. For example:
    CREATE TABLE t;
    RENAME t to u;
    DROP TABLE u;
    RENAME v to u; # v is an old table, its creation/insertions not in log
    And start executing the log (without removing tables beforehand): creates
    t, renames it to u (if not testing create_rename_lsn) thus overwriting
    old-named v, drops u, and we are stuck, we have lost data.
  */
  info= maria_open(new_name, O_RDONLY, HA_OPEN_FOR_REPAIR);
  if (info)
  {
    MARIA_SHARE *share= info->s;
    /* We should not have open instances on this table. */
    if (share->reopen != 1)
    {
      tprint(tracef, ", is already open (reopen=%u)\n", share->reopen);
      ALERT_USER();
      goto end;
    }
    if (!share->base.born_transactional)
    {
      tprint(tracef, ", is not transactional, ignoring renaming\n");
      ALERT_USER();
      goto drop;
    }
    if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
    {
      tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
             " record, ignoring renaming",
             LSN_IN_PARTS(share->state.create_rename_lsn));
/*
We still have to drop the old_name table. Consider:
CREATE TABLE t;
CREATE TABLE v;
RENAME TABLE t TO u;
DROP TABLE u;
RENAME TABLE v TO u;
and then applying the log without removing the tables beforehand.
t is created again, and so is v; when REDO_RENAME (t to u) is replayed,
u already exists and is more recent than the record, so the rename is
skipped, but t must still be dropped or it would be left behind.
*/
goto drop;
}
if (maria_is_crashed(info))
{
tprint(tracef, ", is crashed, can't rename it");
ALERT_USER();
goto end;
}
if (maria_close(info))
goto end;
info= NULL;
/* abnormal situation */
tprint(tracef, ", exists but is older than record, can't rename it");
goto end;
}
else /* one or two files absent, or header corrupted... */
tprint(tracef, ", can't be opened, probably does not exist");
tprint(tracef, ", renaming '%s'", old_name);
if (maria_rename(old_name, new_name))
{
eprint(tracef, "Failed to rename table\n");
goto end;
}
info= maria_open(new_name, O_RDONLY, 0);
if (info == NULL)
{
eprint(tracef, "Failed to open renamed table\n");
goto end;
}
if (_ma_update_create_rename_lsn(info->s, rec->lsn, TRUE))
goto end;
if (maria_close(info))
goto end;
info= NULL;
2007-09-12 11:27:34 +02:00
  error= 0;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
  goto end;
drop:
  tprint(tracef, ", only dropping '%s'", old_name);
  if (maria_delete_table(old_name))
  {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    eprint(tracef, "Failed to drop table\n");
    goto end;
  }
  error= 0;
  goto end;
end:
  tprint(tracef, "\n");
  if (info != NULL)
    error|= maria_close(info);
  return error;
}

/*
  The record may come from REPAIR, ALTER TABLE ENABLE KEYS, OPTIMIZE.
*/
prototype_redo_exec_hook(REDO_REPAIR_TABLE)
{
  int error= 1;
  MARIA_HA *info;
HA_CHECK param;
char *name;
uint quick_repair;
DBUG_ENTER("exec_REDO_LOGREC_REDO_REPAIR_TABLE");
if (skip_DDLs)
{
/*
REPAIR is not exactly a DDL, but it manipulates files without logging
insertions into them.
*/
tprint(tracef, "we skip DDLs\n");
DBUG_RETURN(0);
}
if ((info= get_MARIA_HA_from_REDO_record(rec)) == NULL)
DBUG_RETURN(0);
/*
Otherwise, the mapping is newer than the table, and our record is newer
than the mapping, so we can repair.
*/
tprint(tracef, " repairing...\n");
  maria_chk_init(&param);
  param.isam_file_name= name= info->s->open_file_name;
  /* The repair flags are stored in the record header right after the file id */
  param.testflag= uint4korr(rec->header + FILEID_STORE_SIZE);
  param.tmpdir= maria_tmpdir;
  DBUG_ASSERT(maria_tmpdir);
  /* Restore the logged key map (8 bytes, stored after the 4-byte flags) */
  info->s->state.key_map= uint8korr(rec->header + FILEID_STORE_SIZE + 4);
  quick_repair= param.testflag & T_QUICK;

  /* Call the same repair method that was used when the record was logged */
  if (param.testflag & T_REP_PARALLEL)
  {
    if (maria_repair_parallel(&param, info, name, quick_repair))
      goto end;
  }
  else if (param.testflag & T_REP_BY_SORT)
  {
    if (maria_repair_by_sort(&param, info, name, quick_repair))
      goto end;
  }
  else if (maria_repair(&param, info, name, quick_repair))
    goto end;
  if (_ma_update_create_rename_lsn(info->s, rec->lsn, TRUE))
    goto end;
  error= 0;
end:
  DBUG_RETURN(error);
}


prototype_redo_exec_hook(REDO_DROP_TABLE)
{
  char *name;
  int error= 1;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  MARIA_HA *info;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
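The debugger hook described in this changeset (a checksum routine that calls a break function whenever the result matches a value chosen by the developer) might be sketched as follows. This is only an illustration, not the mysys code: `sketch_checksum`, `sketch_crc_break_value`, `sketch_put_break_here` and `sketch_break_hits` are hypothetical names standing in for the checksum routine, `my_crc_dbug_check` and the break-here function, and the checksum is a toy rotate-and-XOR hash rather than a real CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for my_crc_dbug_check: the checksum value to trap. */
uint32_t sketch_crc_break_value= 0;
/* Counts how often the hook fired, so that its effect can be observed. */
unsigned sketch_break_hits= 0;

/* Hypothetical stand-in for the break-here function: a developer would set
   a debugger breakpoint on this function. */
void sketch_put_break_here(void)
{
  sketch_break_hits++;
}

/* Toy checksum (NOT a real CRC): for each byte, rotate the running value
   left by 5 bits and XOR the byte in. */
uint32_t sketch_checksum(uint32_t seed, const unsigned char *data,
                         size_t length)
{
  uint32_t crc= seed;
  size_t i;
  assert(data != NULL || length == 0);
  for (i= 0; i < length; i++)
    crc= ((crc << 5) | (crc >> 27)) ^ data[i];
  /* The hook: stop in the debugger when the watched value shows up. */
  if (crc == sketch_crc_break_value)
    sketch_put_break_here();
  return crc;
}
```

In this sketch the watched value is a plain global so that it can be changed from a debugger session or, as the changeset suggests for mysqld, from a command-line option.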
    eprint(tracef, "Failed to read record\n");
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
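The key-page change in this changeset, shifting the record number up by one bit so that the freed bit can indicate whether a transaction id follows, can be pictured with the sketch below. It rests on assumptions: the function names are hypothetical, the flag is assumed to occupy the lowest bit, and a 64-bit integer stands in for the stored record position. The actual on-page encoding in Maria may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a record number and a "transid follows" flag into one stored value.
   Hypothetical helper; the flag is assumed to live in the lowest bit. */
uint64_t sketch_pack_recpos(uint64_t record_number, int transid_follows)
{
  /* The top bit must be free, otherwise the shift would lose it. */
  assert((record_number >> 63) == 0);
  return (record_number << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the record number by dropping the flag bit. */
uint64_t sketch_unpack_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Return 1 if the stored value says a transaction id follows, else 0. */
int sketch_transid_follows(uint64_t stored)
{
  return (int) (stored & 1);
}
```

The cost of this layout is one bit of record-number range; in exchange, the reader learns from the key entry itself whether to expect extra bytes after it.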
    return 1;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
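The three-way record classification described in this changeset (a record may not end a group, may end a group, or may be a group in itself and thereby abort any unfinished group) can be illustrated with a small log scanner. This is a sketch under assumptions, not the recovery code: every name is hypothetical, records are reduced to their classification, and "executing" a record is modelled as incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the three-way classification in the message. */
enum sketch_record_in_group
{
  SKETCH_NOT_LAST_IN_GROUP,   /* e.g. a REDO: wait for the group to end */
  SKETCH_LAST_IN_GROUP,       /* e.g. an UNDO: completes the pending group */
  SKETCH_IS_GROUP_ITSELF      /* e.g. COMMIT: stands alone, aborts pending */
};

struct sketch_group_stats
{
  size_t executed;   /* records that belonged to a complete group */
  size_t aborted;    /* records dropped because their group never ended */
  size_t pending;    /* records still waiting when the log ends */
};

/* Scan the log once, deciding for each group whether it is applied. */
struct sketch_group_stats
sketch_scan_log(const enum sketch_record_in_group *records, size_t count)
{
  struct sketch_group_stats stats= { 0, 0, 0 };
  size_t i;
  assert(records != NULL || count == 0);
  for (i= 0; i < count; i++)
  {
    switch (records[i])
    {
    case SKETCH_NOT_LAST_IN_GROUP:
      stats.pending++;                      /* buffer until the group ends */
      break;
    case SKETCH_LAST_IN_GROUP:
      stats.executed+= stats.pending + 1;   /* whole group is applied */
      stats.pending= 0;
      break;
    case SKETCH_IS_GROUP_ITSELF:
      stats.aborted+= stats.pending;        /* unfinished group is dropped */
      stats.pending= 0;
      stats.executed++;                     /* the record itself is applied */
      break;
    }
  }
  return stats;
}
```

With the input REDO, REDO, UNDO, REDO, COMMIT, the first three records form a complete group, while the lone REDO before the COMMIT is discarded. That corresponds to the write-failure scenario in the message, where a COMMIT must not be taken as the end of a group of REDOs.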
  }
  name= log_record_buffer.str;
  tprint(tracef, "Table '%s'", name);
  info= maria_open(name, O_RDONLY, HA_OPEN_FOR_REPAIR);
  if (info)
  {
    MARIA_SHARE *share= info->s;
    if (!share->base.born_transactional)
    {
      tprint(tracef, ", is not transactional, ignoring removal\n");
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually roll back). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
ALERT_USER();
error= 0;
goto end;
}
if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
{
tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
" record, ignoring removal",
LSN_IN_PARTS(share->state.create_rename_lsn));
error= 0;
goto end;
}
if (maria_is_crashed(info))
{
tprint(tracef, ", is crashed, can't drop it");
ALERT_USER();
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
goto end;
}
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
if (close_one_table(info->s->open_file_name, rec->lsn) ||
2007-07-26 11:56:21 +02:00
        maria_close(info))
      goto end;
    info= NULL;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
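The decisions described in this commit message (skip DDLs when the caller asks for it, refuse to touch corrupted tables, and ignore a record when the table's create_rename_lsn is already at least as recent) can be sketched as a small pure function. This is only an illustration under stated assumptions: `sketch_lsn`, `struct sketch_table`, `enum sketch_action` and `sketch_decide_drop()` are hypothetical names that do not exist in ma_recovery.c, the order of the checks is a guess, and treating an equal LSN as "already applied" is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; not the real Maria LSN type. */
typedef uint64_t sketch_lsn;

/* Hypothetical summary of what opening the table revealed. */
struct sketch_table
{
  bool exists;                  /* files present and header readable */
  bool crashed;                 /* table is marked as corrupted */
  sketch_lsn create_rename_lsn; /* LSN of the last DDL applied to the table */
};

enum sketch_action
{
  SKETCH_SKIP_DDL,      /* caller disabled DDL replay */
  SKETCH_NOTHING_TO_DO, /* table cannot be opened, probably absent */
  SKETCH_REFUSE,        /* corrupted table: play safe, do not replay */
  SKETCH_IGNORE_RECORD, /* table is at least as recent as the record */
  SKETCH_DROP           /* table is older than the record: drop it */
};

/* Decide what to do with a "drop table" log record (assumed check order). */
static enum sketch_action
sketch_decide_drop(bool replay_ddls, const struct sketch_table *t,
                   sketch_lsn rec_lsn)
{
  assert(t != NULL);
  if (!replay_ddls)
    return SKETCH_SKIP_DDL;
  if (!t->exists)
    return SKETCH_NOTHING_TO_DO;
  if (t->crashed)
    return SKETCH_REFUSE;
  if (t->create_rename_lsn >= rec_lsn) /* assumption: equality means applied */
    return SKETCH_IGNORE_RECORD;
  return SKETCH_DROP;
}
```

In this sketch a caller in the ha_maria role would pass `replay_ddls` as false and always get `SKETCH_SKIP_DDL`, while a caller in the maria_read_log role would pass true and act on the other four outcomes.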
    /* if it is older, or its header is corrupted, drop it */
    tprint(tracef, ", dropping '%s'", name);
    if (maria_delete_table(name))
    {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread). Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break). Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel). REDO_REPAIR now also logs the used key map. Fixed some bugs in REDO logging of key pages. Better error messages from maria_read_log. Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with fewer than 8 blocks (causes strange crashes). Added EXTRA_DEBUG_KEY_CHANGES. When this is defined, some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise). Fixed bug in ma_pagecache unit tests that caused the program to sometimes fail. Added some missing calls to MY_INIT() that caused some unit tests to fail. Fixed that TRUNCATE works properly on temporary MyISAM files. Updated some result files to the new table checksum results (checksum when NULL fields are ignored). perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
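The point of a dedicated "file too short" error, as opposed to a generic -1, is that a caller can tell a truncated file apart from a real I/O failure. A minimal sketch of that distinction using only standard C stdio follows. It does not use my_read()/my_pread() or the real HA_ERR_FILE_TOO_SHORT constant; `sketch_read_exact()` and the `SKETCH_*` codes are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result codes; the real code uses HA_ERR_FILE_TOO_SHORT. */
enum
{
  SKETCH_OK= 0,
  SKETCH_ERR_IO= 1,
  SKETCH_ERR_FILE_TOO_SHORT= 2
};

/*
  Read exactly 'want' bytes, in the spirit of a "must read everything" flag.
  A short read caused by end-of-file is reported separately from an I/O
  error, so the caller can print a more precise message.
*/
static int sketch_read_exact(FILE *f, void *buf, size_t want)
{
  size_t got;
  assert(f != NULL && buf != NULL);
  got= fread(buf, 1, want, f);
  if (got == want)
    return SKETCH_OK;
  return ferror(f) ? SKETCH_ERR_IO : SKETCH_ERR_FILE_TOO_SHORT;
}
```

A caller would then test for `SKETCH_ERR_FILE_TOO_SHORT` explicitly rather than comparing the return value against -1, which mirrors the "Test for HA_ERR_FILE_TOO_SHORT instead for -1" changes listed in the commit message.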
      eprint(tracef, "Failed to drop table\n");
      goto end;
    }
  }
  else /* one or two files absent, or header corrupted... */
    tprint(tracef,", can't be opened, probably does not exist");
  error= 0;
end:
  tprint(tracef, "\n");
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually roll back). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled.
storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c: don't write log records during recovery
storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh: cleanup files. Test log applying.
storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
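The three-way group rule described in the ma_loghandler.h note of the commit message above (a record may not end a group, may end a group, or may be a group in itself) can be illustrated with a minimal sketch. All names here (`group_role`, `group_state`, `feed_record`) are hypothetical and are not the identifiers used in ma_loghandler.h; the sketch also assumes, as the message implies, that an UNDO is the record that closes a group of REDOs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification; the real enum lives in ma_loghandler.h. */
enum group_role
{
  REC_NOT_GROUP_END,  /* e.g. a REDO: joins the currently open group */
  REC_GROUP_END,      /* assumed: an UNDO, which closes the open group */
  REC_GROUP_ITSELF    /* e.g. COMMIT: stands alone, aborts any open group */
};

/* Counters only; a real reader would keep the records themselves. */
struct group_state
{
  unsigned pending;   /* records waiting in the unfinished group */
  unsigned executed;  /* records released for execution */
  unsigned aborted;   /* records dropped with an unfinished group */
};

static void feed_record(struct group_state *st, enum group_role role)
{
  assert(st != NULL);
  switch (role)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                    /* wait for the end of the group */
    break;
  case REC_GROUP_END:
    st->executed+= st->pending + 1;   /* whole group, end record included */
    st->pending= 0;
    break;
  case REC_GROUP_ITSELF:
    st->aborted+= st->pending;        /* unfinished group is not executed */
    st->pending= 0;
    st->executed++;                   /* the standalone record itself */
    break;
  }
}
```

With this rule, two REDOs followed directly by a COMMIT leave both REDOs counted as aborted, which matches the failed-write scenario in the message: the COMMIT must not be taken as the end of the REDO group.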
if (info != NULL)
error|= maria_close(info);
return error;
}
prototype_redo_exec_hook(FILE_ID)
WL#3072 - Maria recovery
2007-07-26 11:56:21 +02:00
{
uint16 sid;
int error= 1;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
const char *name;
MARIA_HA *info;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with fewer than 8 blocks (Causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updates some result files to new table checksums results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
DBUG_ENTER("exec_REDO_LOGREC_FILE_ID");
WL#3072 Maria recovery
2007-08-29 16:43:01 +02:00
if (cmp_translog_addr(rec->lsn, checkpoint_start) < 0)
{
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
/*
If that mapping was still valid at checkpoint time, it was found in the
checkpoint record, so there is no need to recreate it. If that mapping had
already ended by checkpoint time (table was closed or repaired), a flush
and force happened, so the mapping is not needed.
*/
tprint(tracef, "ignoring because before checkpoint\n");
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
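The commit note above separates "the file ended before the requested bytes" from a genuine I/O failure. A minimal sketch of that distinction, assuming POSIX pread(); read_exact_at(), demo_short_read() and the SKETCH_* codes are hypothetical names for illustration and are not the mysys API or its real error values:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes for this sketch only. */
enum { SKETCH_OK = 0, SKETCH_ERR_IO = -1, SKETCH_ERR_FILE_TOO_SHORT = -2 };

/* Read exactly `count` bytes at `offset`, or say why that was impossible. */
static int read_exact_at(int fd, void *buf, size_t count, off_t offset)
{
  unsigned char *dst = buf;
  size_t done = 0;
  assert(buf != NULL);
  while (done < count)
  {
    ssize_t got = pread(fd, dst + done, count - done, offset + (off_t) done);
    if (got < 0)
    {
      if (errno == EINTR)
        continue;                        /* interrupted: retry */
      return SKETCH_ERR_IO;              /* real I/O error */
    }
    if (got == 0)
      return SKETCH_ERR_FILE_TOO_SHORT;  /* end of file before `count` bytes */
    done += (size_t) got;
  }
  return SKETCH_OK;
}

/* Usage: a 4-byte file satisfies a 4-byte read but not an 8-byte one. */
static int demo_short_read(void)
{
  char buf[8];
  int full, partial;
  FILE *f = tmpfile();
  if (f == NULL)
    return SKETCH_ERR_IO;
  fputs("abcd", f);
  fflush(f);
  full = read_exact_at(fileno(f), buf, 4, 0);
  partial = read_exact_at(fileno(f), buf, 8, 0);
  fclose(f);
  return (full == SKETCH_OK && partial == SKETCH_ERR_FILE_TOO_SHORT)
         ? SKETCH_OK : SKETCH_ERR_IO;
}
```

A caller that previously tested for a generic -1 can then report a truncated file separately from a failed device read.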
  DBUG_RETURN(0);
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)

mysys/my_realloc.c: noted an issue in my_realloc()
sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc: update to new prototype
storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it is corrupted, and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO creates a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page
storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h: new prototype
storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h: new function
storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c: tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed in the UNDO phase
* a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
}
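The ma_bitmap.c note in the commit message above describes a two-step page creation in which a crash between the steps must leave a state that recovery can recognise as "too short, recreate". A minimal sketch of that idea, assuming POSIX ftruncate()/pwrite() in place of chsize; the 8192-byte page and the 2-byte "bm" marker follow the commit notes, while every sketch_* name is hypothetical and none of this is the actual ma_bitmap.c code:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

enum { SKETCH_PAGE_SIZE = 8192, SKETCH_MARKER_SIZE = 2 };
enum sketch_page_state { SKETCH_PAGE_TOO_SHORT, SKETCH_PAGE_CORRUPT, SKETCH_PAGE_OK };

static const unsigned char sketch_marker[SKETCH_MARKER_SIZE] = { 'b', 'm' };

/* Step 1: extend the file to just short of a full page (zero-filled). */
static int sketch_extend_page(int fd)
{
  return ftruncate(fd, SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE);
}

/* Step 2: append the end marker; only now does the page reach full size. */
static int sketch_write_marker(int fd)
{
  ssize_t n = pwrite(fd, sketch_marker, SKETCH_MARKER_SIZE,
                     SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE);
  return n == SKETCH_MARKER_SIZE ? 0 : -1;
}

/* What a reader can conclude about the first page of the file. */
static enum sketch_page_state sketch_check_page(int fd)
{
  struct stat st;
  unsigned char tail[SKETCH_MARKER_SIZE];
  if (fstat(fd, &st) != 0)
    return SKETCH_PAGE_CORRUPT;
  if (st.st_size < SKETCH_PAGE_SIZE)
    return SKETCH_PAGE_TOO_SHORT;        /* incomplete page: safe to recreate */
  if (pread(fd, tail, sizeof(tail), SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE) !=
      (ssize_t) sizeof(tail))
    return SKETCH_PAGE_CORRUPT;
  return memcmp(tail, sketch_marker, sizeof(tail)) == 0
         ? SKETCH_PAGE_OK : SKETCH_PAGE_CORRUPT;
}

/* Usage: walk a scratch file through the observable states. */
static int sketch_demo(void)
{
  int fd, ok;
  FILE *f = tmpfile();
  if (f == NULL)
    return -1;
  fd = fileno(f);
  ok = sketch_extend_page(fd) == 0
    && sketch_check_page(fd) == SKETCH_PAGE_TOO_SHORT  /* "crash" before step 2 */
    && sketch_write_marker(fd) == 0
    && sketch_check_page(fd) == SKETCH_PAGE_OK;
  fclose(f);
  return ok ? 0 : -1;
}
```

With the older ordering (extend to the full 8192 bytes first), an interrupted creation would instead be classified as a full-size page with a wrong marker, which is the corruption case the note wanted to avoid.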
WL#3072 - Maria recovery
Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now.
Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria.
Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW.
Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE.
Code cleanups.
Monty: please look for "QQ". Sanja: please look for "Sanja".
Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase...
Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE.

sql/handler.cc: typo
storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c
storage/maria/ha_maria.cc: comments
storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct; see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c:
* no need to sync the data file if table is not transactional
* Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store).
* Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR).
* If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update).
* update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update).
* applying of REDO_PURGE_BLOCKs (limited to a one-page range for now).
storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported.
storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()).
storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used.
storage/maria/ma_control_file.h: double-inclusion safe
storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c
storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero.
storage/maria/ma_loghandler.c:
* inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state").
* LOG_DESC::record_ends_group changed to an enum.
* LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected
* Sanja please see the @todo LOG BUG
* avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
storage/maria/ma_loghandler.h:
- log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had
- instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log.
storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled.
storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c: don't write log records during recovery
storage/maria/ma_test2.c:
- fail if maria_info() or other subtests find some wrong information
- new option -g to skip updates.
- init the translog before creating the table, so that log applying can work.
- in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh: cleanup files. Test log applying.
storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
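The ma_loghandler.h note in the commit message above replaces a "record ends group" boolean with three cases: a record may not end a group, may end a group, or may be a group in itself that aborts any unfinished group before it. A minimal sketch of how a log scanner could act on those three cases; the enum constants, struct and function names here are hypothetical illustrations, not the identifiers used in ma_loghandler.h or ma_recovery.c:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-way classification of a log record type. */
enum sketch_group_role
{
  SKETCH_NOT_LAST_IN_GROUP,  /* e.g. a REDO: wait for the end of its group */
  SKETCH_LAST_IN_GROUP,      /* e.g. an UNDO: completes the pending group */
  SKETCH_IS_GROUP_ITSELF     /* e.g. COMMIT: stands alone, aborts pending group */
};

/* Running counters of a scan over the log. */
struct sketch_scan
{
  size_t pending;   /* records seen in the still-open group */
  size_t executed;  /* records that belonged to a completed group */
  size_t aborted;   /* records dropped because their group never completed */
};

/* Feed one record's role to the scanner. */
static void sketch_feed(struct sketch_scan *s, enum sketch_group_role role)
{
  assert(s != NULL);
  switch (role)
  {
  case SKETCH_NOT_LAST_IN_GROUP:
    s->pending++;                   /* remember it until the group is complete */
    break;
  case SKETCH_LAST_IN_GROUP:
    s->executed += s->pending + 1;  /* group complete: all of it is executed */
    s->pending = 0;
    break;
  case SKETCH_IS_GROUP_ITSELF:
    s->aborted += s->pending;       /* unfinished group is discarded */
    s->pending = 0;
    s->executed++;                  /* the standalone record itself is executed */
    break;
  }
}
```

Feeding REDO, REDO, UNDO executes three records; feeding REDO then COMMIT executes only the COMMIT and counts the orphaned REDO as aborted, which matches the write-failure scenario the note describes.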
  /* Grow log_record_buffer if needed, then read the whole record into it */
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
2007-12-04 22:23:42 +01:00
    eprint(tracef, "Failed to read record\n");
2007-07-26 11:56:21 +02:00
    goto end;
  }
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
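The bitmap-page change described in the commit message above (extend the file to 8190 bytes, then write the 2-byte end marker) can be illustrated with a small standalone sketch. This is not Maria's code: the `sketch_*` names, the enum, and the use of plain POSIX `ftruncate`/`pwrite` in place of the mysys wrappers are assumptions made for illustration, and the "bm" marker value is taken from the related commit text rather than from the source.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names; the sizes follow the commit message (8k page, 2-byte marker). */
#define SKETCH_PAGE_SIZE   8192
#define SKETCH_MARKER_SIZE 2

static const unsigned char sketch_marker[SKETCH_MARKER_SIZE]= { 'b', 'm' };

enum sketch_page_state
{
  SKETCH_PAGE_OK,         /* full page with a valid end marker */
  SKETCH_PAGE_TOO_SHORT,  /* page incomplete: safe to recreate it entirely */
  SKETCH_PAGE_CORRUPT,    /* full-size page whose marker is missing */
  SKETCH_PAGE_IO_ERROR
};

/* Step 1: extend the file to page size MINUS the marker (8190 bytes). */
static int sketch_extend_page(int fd)
{
  assert(fd >= 0);
  return ftruncate(fd, SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE);
}

/* Step 2: write the marker; only now does the file reach 8192 bytes. */
static int sketch_write_marker(int fd)
{
  ssize_t written;
  assert(fd >= 0);
  written= pwrite(fd, sketch_marker, SKETCH_MARKER_SIZE,
                  SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE);
  return written == SKETCH_MARKER_SIZE ? 0 : -1;
}

/* What a recovery-like reader could conclude about the first page. */
static enum sketch_page_state sketch_check_page(int fd)
{
  struct stat st;
  unsigned char tail[SKETCH_MARKER_SIZE];
  assert(fd >= 0);
  if (fstat(fd, &st))
    return SKETCH_PAGE_IO_ERROR;
  if (st.st_size < SKETCH_PAGE_SIZE)
    return SKETCH_PAGE_TOO_SHORT;
  if (pread(fd, tail, SKETCH_MARKER_SIZE,
            SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE) != SKETCH_MARKER_SIZE)
    return SKETCH_PAGE_IO_ERROR;
  return memcmp(tail, sketch_marker, SKETCH_MARKER_SIZE) ?
    SKETCH_PAGE_CORRUPT : SKETCH_PAGE_OK;
}
```

With this ordering, a crash between the two steps leaves a file shorter than one page, which the checker reports as too short rather than corrupt. Extending straight to 8192 bytes first, as the older scheme did, would leave a full-size page with no marker.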
sid= fileid_korr(log_record_buffer.str);
info= all_tables[sid].info;
if (info != NULL)
{
tprint(tracef, " Closing table '%s'\n", info->s->open_file_name);
prepare_table_for_close(info, rec->lsn);
if (maria_close(info))
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_dbug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log.
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to close table\n");
2007-07-26 11:56:21 +02:00
goto end;
}
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
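The ma_bitmap.c note in the changeset above distinguishes a bitmap page that is too short (recreated by recovery) from a full-length page without its end marker (treated as corrupted). A minimal sketch of that decision follows. The function name, the enum, the 8192-byte block size constant and the position of the "bm" marker in the last two bytes are assumptions made for illustration; this is not the code from ma_bitmap.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8192u  /* assumed page size, taken from the "8k bitmap page" note */

/* Hypothetical 2-byte end marker; the changeset notes call it the "bm" signature. */
static const unsigned char bitmap_marker[2]= { 'b', 'm' };

enum bitmap_page_state
{
  BITMAP_PAGE_OK,        /* full page, marker present */
  BITMAP_PAGE_RECREATE,  /* file too short: recreate the page entirely */
  BITMAP_PAGE_CORRUPTED  /* full page but marker missing */
};

/* Classify the first bitmap page from the raw bytes read and the file length. */
static enum bitmap_page_state
classify_first_bitmap_page(const unsigned char *file, size_t file_length)
{
  assert(file != NULL);
  if (file_length < BLOCK_SIZE)
    return BITMAP_PAGE_RECREATE;  /* e.g. crash after a chsize to 8190 bytes */
  if (memcmp(file + BLOCK_SIZE - sizeof(bitmap_marker),
             bitmap_marker, sizeof(bitmap_marker)) != 0)
    return BITMAP_PAGE_CORRUPTED; /* e.g. crash after a chsize to 8192 bytes */
  return BITMAP_PAGE_OK;
}
```

Under these assumptions, a file truncated to 8190 bytes can never be mistaken for a complete page, which is the point of shrinking the chsize by the marker's length.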
all_tables[sid].info= NULL;
}
name= log_record_buffer.str + FILEID_STORE_SIZE;
if (new_table(sid, name, -1, -1, rec->lsn))
goto end;
error= 0;
end:
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
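The changeset above introduces a distinct error for files that end before the requested number of bytes, so callers no longer have to infer it from a generic -1. The sketch below illustrates that idea with plain POSIX pread(); it does not use my_pread(), MY_NABP or the real HA_ERR_FILE_TOO_SHORT value, and the function and enum names are hypothetical.

```c
#define _XOPEN_SOURCE 700  /* for pread() under strict C standards modes */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes; READ_FILE_TOO_SHORT stands in for the new error. */
enum read_status { READ_OK= 0, READ_IO_ERROR= 1, READ_FILE_TOO_SHORT= 2 };

/*
  Read exactly `count` bytes at `offset`, or report why that was not possible.
  End of file before `count` bytes is reported separately from an I/O error.
*/
static enum read_status
pread_exact(int fd, void *buf, size_t count, off_t offset)
{
  size_t done= 0;
  while (done < count)
  {
    ssize_t got= pread(fd, (char *) buf + done, count - done,
                       offset + (off_t) done);
    if (got < 0)
    {
      if (errno == EINTR)
        continue;                  /* interrupted: retry the same read */
      return READ_IO_ERROR;
    }
    if (got == 0)
      return READ_FILE_TOO_SHORT;  /* hit end of file early */
    done+= (size_t) got;
  }
  return READ_OK;
}
```

A caller that must tolerate a truncated last page can then branch on the "too short" status alone and still treat every other failure as fatal.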
DBUG_RETURN(error);
}
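The call to new_table() above passes rec->lsn, the LSN of the record that carries the id->name mapping. The WL#3071 changeset notes for ma_recovery.c describe two LSN comparisons built on that value: new_table() skips a table whose mapping is older than create_rename_lsn, and get_MARIA_HA_from_REDO_record() skips a record that is older than the mapping. A minimal sketch of both checks follows. The lsn_t typedef and the helper names are hypothetical, plain integer comparison stands in for the log handler's own LSN comparison, and treating equal LSNs as "not skipped" is an assumption.

```c
#include <stdint.h>

typedef uint64_t lsn_t;  /* hypothetical stand-in for the transaction log's LSN type */

/*
  Table-level check: the id->name mapping dates from lsn_of_file_id. If the
  table was created or renamed after that point, the mapping describes an
  older incarnation of the table and the table should be skipped.
*/
static int mapping_is_stale(lsn_t lsn_of_file_id, lsn_t create_rename_lsn)
{
  return lsn_of_file_id < create_rename_lsn;
}

/*
  Record-level check: a REDO written before the mapping existed (for example
  a record that precedes the checkpoint record) must not be applied through
  that mapping.
*/
static int record_predates_mapping(lsn_t record_lsn, lsn_t lsn_of_file_id)
{
  return record_lsn < lsn_of_file_id;
}
```

Keeping the two checks separate mirrors the notes: one decides whether a table is opened for recovery at all, the other filters individual records for a table that is already open.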
static int new_table(uint16 sid, const char *name,
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
                     File org_kfile, File org_dfile,
                     LSN lsn_of_file_id)
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
{
  /*
    -1 (skip table): close table and return 0;
    1 (error): close table and return 1;
    0 (success): leave table open and return 0.
  */
  int error= 1;
  MARIA_HA *info;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  MARIA_SHARE *share;
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
  checkpoint_useful= TRUE;
  if ((name == NULL) || (name[0] == 0))
  {
    /*
      we didn't use DBUG_ASSERT() because such record corruption could
      silently pass in the "info == NULL" test below.
    */
    tprint(tracef, ", record is corrupted");
    info= NULL;
    goto end;
  }
  tprint(tracef, "Table '%s', id %u", name, sid);
  info= maria_open(name, O_RDWR, HA_OPEN_FOR_REPAIR);
  if (info == NULL)
  {
    tprint(tracef, ", is absent (must have been dropped later?)"
           " or its header is so corrupted that we cannot open it;"
           " we skip it");
    /* A missing or unopenable table is skipped, not treated as a failure */
    error= 0;
    goto end;
  }
  if (maria_is_crashed(info))
  {
    tprint(tracef, "Table is crashed, can't apply log records to it\n");
    goto end;
  }
  share= info->s;
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
/* check that we're not already using it */
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
if (share->reopen != 1)
{
  tprint(tracef, ", is already open (reopen=%u)\n", share->reopen);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group).
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
  /*
    It could be that we have in the log
    FILE_ID(t1,10) ... (t1 was flushed) ... FILE_ID(t1,12);
  */
  if (close_one_table(share->open_file_name, lsn_of_file_id))
    goto end;
}
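The WL#4137 notes above describe why a page touched by a REDO is kept pinned and only stamped with the UNDO's LSN once the whole group has been applied: stamping at once would make a later REDO for the same page in the same group look already applied. The following is a minimal standalone sketch of that idea, not the engine's code; every name in it (`toy_lsn`, `toy_apply_redo`, `toy_end_group`, `toy_demo`) is hypothetical, and pages are modelled as plain array slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_lsn;            /* hypothetical stand-in for an LSN */

#define TOY_PAGES 4

static toy_lsn page_lsn[TOY_PAGES];  /* LSN currently stamped on each page */
static int page_changes[TOY_PAGES];  /* how many REDOs were applied per page */
static bool page_pinned[TOY_PAGES];  /* touched by the group being applied */

/* Apply one REDO; skip it if the page already carries a newer-or-equal LSN. */
static bool toy_apply_redo(int page, toy_lsn redo_lsn)
{
  if (page_lsn[page] >= redo_lsn)
    return false;
  page_changes[page]++;
  page_pinned[page]= true;           /* keep pinned, leave page_lsn untouched */
  return true;
}

/* End of group: unpin every touched page and stamp it with the UNDO's LSN. */
static void toy_end_group(toy_lsn undo_lsn)
{
  for (int i= 0; i < TOY_PAGES; i++)
  {
    if (page_pinned[i])
    {
      page_lsn[i]= undo_lsn;
      page_pinned[i]= false;
    }
  }
}

/* Two REDOs (LSN 10 and 11) for page 0 in one group closed by an UNDO at 12. */
static int toy_demo(void)
{
  toy_apply_redo(0, 10);
  toy_apply_redo(0, 11);             /* not skipped: page_lsn[0] is still 0 */
  toy_end_group(12);
  assert(page_lsn[0] == 12);
  return page_changes[0];
}
```

In this sketch, replaying the same group a second time skips both REDOs, because the page now carries the group's end LSN.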
DBUG_ASSERT(share->now_transactional == share->base.born_transactional);
if (!share->base.born_transactional)
{
  tprint(tracef, ", is not transactional\n");
  ALERT_USER();
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
error= -1;
goto end;
}
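The bitmap-page fix described in the WL#3072 commit message above (extend the file to 8190 bytes first, write the 2-byte end marker last) can be illustrated with a small in-memory sketch. It is not the code from ma_bitmap.c: every `sketch_` name is hypothetical, and the marker bytes are an assumed value chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192

/* Hypothetical 2-byte end marker; the real marker's value is an assumption. */
static const unsigned char sketch_marker[2] = { 0xFF, 0xFE };

enum sketch_page_state
{
  SKETCH_PAGE_RECREATE, /* page too short: creation was interrupted */
  SKETCH_PAGE_OK,       /* full page with a valid end marker */
  SKETCH_PAGE_CORRUPT   /* full page but marker missing */
};

/* Length the file is first extended to: the whole page minus the marker. */
static size_t sketch_initial_length(void)
{
  return SKETCH_PAGE_SIZE - sizeof(sketch_marker);
}

/*
  Classify the first bitmap page from the file's length and content.
  A crash between "extend to 8190" and "write marker" leaves a short file,
  which is recognised as incomplete instead of being taken for corruption.
*/
static enum sketch_page_state
sketch_check_first_bitmap_page(const unsigned char *file, size_t file_length)
{
  assert(file != NULL);
  if (file_length < SKETCH_PAGE_SIZE)
    return SKETCH_PAGE_RECREATE;
  if (memcmp(file + SKETCH_PAGE_SIZE - sizeof(sketch_marker),
             sketch_marker, sizeof(sketch_marker)) != 0)
    return SKETCH_PAGE_CORRUPT;
  return SKETCH_PAGE_OK;
}
```

With the old scheme (extend to 8192, then overwrite the last 2 bytes) an interrupted creation would land in the `SKETCH_PAGE_CORRUPT` branch; with the shorter initial length it lands in `SKETCH_PAGE_RECREATE`.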
WL#3071 Maria checkpoint
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed).
WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)
BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am: compile Checkpoint module
storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly.
storage/maria/ma_blockrec.c:
  * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor)
  * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase.
storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c: Checkpoint module.
storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages.
storage/maria/ma_create.c:
  * no need to init some vars as the initial bzero(share) takes care of this.
  * update to new function's name
  * even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen().
storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
  * UNDO_ROW_PURGE gone (see ma_blockrec.c)
  * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed in the checkpoint record (Recovery wants to know the validity domain of an id->name mapping)
  * translog_get_horizon_no_lock() needed for Checkpoint
  * comment about failing assertion (Sanja knows)
  * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header".
  * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook.
  * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock.
  * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN.
storage/maria/ma_loghandler.h: prototype updates
storage/maria/ma_open.c: no need to initialize "res"
storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h: new prototype
storage/maria/ma_recovery.c:
  * added replaying of REDO_RENAME_TABLE
  * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
  * Recovery from the last checkpoint record now possible
  * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id).
  * in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record).
  * parse_checkpoint_record() has to return a LSN, that's what caller expects
storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
  * equivalent of ma_test1's --test-undo added (named -u here).
  * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state).
storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT.
storage/maria/ma_write.c: comment
storage/maria/maria_chk.c: comment
storage/maria/maria_def.h:
  * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint.
  * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs).
storage/maria/trnman.c:
  * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
if (cmp_translog_addr(lsn_of_file_id, share->state.create_rename_lsn) <= 0)
{
tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
" LOGREC_FILE_ID's LSN (%lu,0x%lx), ignoring open request",
LSN_IN_PARTS(share->state.create_rename_lsn),
LSN_IN_PARTS(lsn_of_file_id));
error= -1;
goto end;
}
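The check above ignores an open request when the table's create_rename_lsn is at least as recent as the LSN of the LOGREC_FILE_ID record. A minimal standalone sketch of that decision follows. It assumes an LSN is a 64-bit value with the file number in the high 32 bits and the offset in the low 32 bits; that layout, and every `sketch_` name, is hypothetical and not taken from ma_loghandler_lsn.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical LSN: (file number << 32) | offset. An assumption of this sketch. */
typedef uint64_t sketch_lsn;

/* Hypothetical subset of a table's persistent state. */
typedef struct
{
  sketch_lsn create_rename_lsn; /* REDOs older than this must not be applied */
} sketch_share;

static sketch_lsn sketch_make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((sketch_lsn) file_no << 32) | offset;
}

/* Three-way comparison: negative, zero or positive. */
static int sketch_cmp_lsn(sketch_lsn a, sketch_lsn b)
{
  return (a > b) - (a < b);
}

/*
  Returns 1 if the id->name mapping (dated lsn_of_file_id) is not newer
  than the table's create_rename_lsn, i.e. the open request is ignored.
*/
static int sketch_must_ignore_open(const sketch_share *share,
                                   sketch_lsn lsn_of_file_id)
{
  assert(share != NULL);
  return sketch_cmp_lsn(lsn_of_file_id, share->create_rename_lsn) <= 0;
}
```

Note that equality also counts as "ignore": a mapping logged at exactly create_rename_lsn is treated as belonging to the table's previous life.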
/* don't log any records for this work */
WL#3072 - Maria recovery
maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed.
storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first
storage/maria/ma_blockrec.c: comment
storage/maria/ma_loghandler.c: new type of log record
storage/maria/ma_loghandler.h: new type of log record
storage/maria/ma_recovery.c:
  * maria_apply_log() now returns a count of warnings. What currently produces warnings is:
    - skipping applying UNDOs though there are some (=> inconsistent table)
    - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and so is incomplete).
    Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen).
  * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings
storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
_ma_tmp_disable_logging_for_table(info, FALSE);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run
  See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside.
* Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c).
* Fixing small bugs in recovery
mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script).
mysql-test/lib/mtr_process.pl:
  * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld.
  * Remove "expect" file before restarting; otherwise there could be a race condition: test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it.
storage/maria/ma_blockrec.c:
  - when applying a REDO in recovery, we no longer put the UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN.
  - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range.
  - Both bugs are covered in maria-recovery.test
storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols.
storage/maria/ma_open.c: debugging info
storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed.
storage/maria/ma_recovery.c:
  - open trace file in append mode (useful when a test triggers several recoveries, we see them all).
  - fixing wrong error detection, it's possible that during recovery we want to open an already open table.
  - when applying a REDO in recovery, we no longer put the UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN.
  - we verify that all log records of a group are about the same table, for debugging.
mysql-test/r/maria-recovery.result: result
mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up.
mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137)
  - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs.
  - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway).
  - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group).
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery.
mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened).
mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones).
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course.
mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
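The deferred stamping described in the commit message above (keep pages pinned during a group of REDOs, stamp them with the UNDO's LSN only when the group ends) can be sketched as follows. This is a simplified model, not the page cache API: the `sketch_` types and functions are hypothetical, and the skip rule "page LSN >= record LSN" is the assumption the sketch is built on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PINNED 16

/* Hypothetical page: only the fields this sketch needs. */
typedef struct
{
  uint64_t lsn;      /* LSN stamped on the page */
  int applied_count; /* number of REDOs applied to it */
  int pinned;        /* 1 while held by the current group */
} sketch_page;

/* Pages touched by the current group of REDO records. */
typedef struct
{
  sketch_page *pinned[SKETCH_MAX_PINNED];
  size_t count;
} sketch_group;

/*
  Apply one REDO. Returns 1 if applied, 0 if skipped because the page is
  already at least as new as the record, -1 if the pin list is full.
  The page's LSN is deliberately left unchanged here.
*/
static int sketch_apply_redo(sketch_group *g, sketch_page *page,
                             uint64_t redo_lsn)
{
  assert(g != NULL && page != NULL);
  if (page->lsn >= redo_lsn)
    return 0;
  if (!page->pinned)
  {
    if (g->count == SKETCH_MAX_PINNED)
      return -1;
    page->pinned= 1;
    g->pinned[g->count++]= page;
  }
  page->applied_count++;
  return 1;
}

/* End of group: stamp every pinned page with the UNDO's LSN and unpin it. */
static void sketch_end_group(sketch_group *g, uint64_t undo_lsn)
{
  size_t i;
  assert(g != NULL);
  for (i= 0; i < g->count; i++)
  {
    g->pinned[i]->lsn= undo_lsn;
    g->pinned[i]->pinned= 0;
  }
  g->count= 0;
}
```

Had the first REDO stamped the page with the group's final LSN immediately, a second REDO of the same group with a smaller LSN would hit the skip rule and never be applied, which is the bug the commit message describes.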
/* _ma_unpin_all_pages() reads info->trn: */
info->trn= &dummy_transaction_object;
2007-07-26 11:56:21 +02:00
/* execution of some REDO records relies on data_file_length */
my_off_t dfile_len= my_seek(info->dfile.file, 0, SEEK_END, MYF(MY_WME));
my_off_t kfile_len= my_seek(info->s->kfile.file, 0, SEEK_END, MYF(MY_WME));
if ((dfile_len == MY_FILEPOS_ERROR) ||
(kfile_len == MY_FILEPOS_ERROR))
{
tprint(tracef, ", length unknown\n");
goto end;
}
if (share->state.state.data_file_length != dfile_len)
{
tprint(tracef, ", has wrong state.data_file_length (fixing it)");
share->state.state.data_file_length= dfile_len;
}
if (share->state.state.key_file_length != kfile_len)
{
tprint(tracef, ", has wrong state.key_file_length (fixing it)");
share->state.state.key_file_length= kfile_len;
}
Fixed repair_by_sort to work with BLOCK_RECORD
Fixed bugs in undo logging
Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read)
Reserved place for reference-transid on key pages (for packing of transids)
ALTER TABLE and INSERT ... SELECT now use fast creation of index
Known bugs:
- ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix)
- ma_test_recovery.expected is not totally correct, because of the above bug
- mysqld sometimes fails to restart; fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate
include/maria.h: Prototype changes. Added current_filepos to st_maria_sort_info.
mysql-test/r/maria.result: Updated results that change as ALTER TABLE and INSERT ... SELECT now use fast creation of index.
mysys/mf_iocache.c: Reset variable to guard against double invocation.
storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair).
storage/maria/ma_blockrec.c: Simplify code. More initial allocations. Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read).
storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h. Added prototype for new functions.
storage/maria/ma_check.c: Simplify code. Fixed repair_by_sort to work with BLOCK_RECORD:
- When using BLOCK_RECORD or UNPACK create new Maria handle
- Use common initializer function
- Align code with maria_repair()
Made some changes to maria_repair_parallel() to use common initializer function. Removed ASK_MONTY section by fixing noted problem.
storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write().
storage/maria/ma_key_recover.c: Use different log entries depending on whether the key root changes or not. This fixed some bugs when tree grows.
storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key.
storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT.
storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT.
storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's). For compressed records, alloc a bit bigger buffer to avoid valgrind warnings. If table is opened readonly, don't update state.
storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors.
storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT.
storage/maria/ma_sort.c: More logging.
storage/maria/ma_test_all.sh: More tests.
storage/maria/ma_test_recovery.expected: Update results. Note that this is not complete because of a bug in recovery.
storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages).
storage/maria/maria_chk.c: When using flag --read-only, don't update status for files. When using --unpack, don't use REPAIR_BY_SORT if other repair option is given. Enable repair_by_sort for BLOCK records. Removed not needed newline at start of --describe.
storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages.
storage/maria/maria_read_log.c: renamed --only-display to --display-only.
2007-11-28 20:38:30 +01:00
if ((dfile_len % share->block_size) || (kfile_len % share->block_size))
{
tprint(tracef, ", has too short last page\n");
/* Recovery will fix this, no error */
ALERT_USER();
}
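The check above treats a data or index file whose length is not a multiple of share->block_size as having a truncated last page. A minimal standalone sketch of that arithmetic follows; page_layout, describe_file_length() and has_short_last_page() are hypothetical names used only for illustration and are not part of ma_recovery.c, and plain fixed-width integers stand in for my_off_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, for illustration only. */
typedef struct
{
  uint64_t full_pages;     /* number of complete pages in the file */
  uint64_t trailing_bytes; /* size of a truncated last page, 0 if none */
} page_layout;

/* Split a physical file length into complete pages and leftover bytes. */
static page_layout describe_file_length(uint64_t file_len,
                                        uint32_t block_size)
{
  page_layout layout;
  assert(block_size > 0);
  layout.full_pages= file_len / block_size;
  layout.trailing_bytes= file_len % block_size;
  return layout;
}

/*
  Same condition as the test in the code above: either file ending in
  the middle of a page counts as "too short last page".
*/
static int has_short_last_page(uint64_t dfile_len, uint64_t kfile_len,
                               uint32_t block_size)
{
  return describe_file_length(dfile_len, block_size).trailing_bytes != 0 ||
         describe_file_length(kfile_len, block_size).trailing_bytes != 0;
}
```

With a block size of 8192, a data file of 3*8192+100 bytes has three complete pages plus 100 trailing bytes, so the sketch reports a short last page. In the real code this case only alerts the user, since Recovery is expected to repair it.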
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return an LSN, that's what the caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys" (etc.) index state). storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
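The commit message above says that CLR_END lets recovery jump over an UNDO_INSERT that was already rolled back. A minimal sketch of that idea follows, assuming a simplified record layout that is not Maria's real one: every name here (`sketch_lsn`, `sketch_rec`, `sketch_undo_lsn_after_redo_phase`) is hypothetical, and the assumption is that each UNDO stores the LSN of the transaction's previous UNDO while a CLR_END stores the previous-UNDO LSN of the record it compensates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; 0 means "no record". */
typedef uint64_t sketch_lsn;

enum sketch_rec_type { SKETCH_REDO, SKETCH_UNDO, SKETCH_CLR_END };

/* Simplified log record of a single transaction (assumed layout). */
struct sketch_rec {
  enum sketch_rec_type type;
  sketch_lsn lsn;            /* position of this record in the log */
  sketch_lsn prev_undo_lsn;  /* UNDO: previous UNDO of the transaction;
                                CLR_END: previous UNDO of the undone record */
};

/*
  Scans the log forward, as a REDO phase would, and returns the LSN of the
  first UNDO that an UNDO phase would still have to execute (0 if none).
*/
sketch_lsn sketch_undo_lsn_after_redo_phase(const struct sketch_rec *log,
                                            size_t count)
{
  sketch_lsn undo_lsn = 0;
  size_t i;
  assert(log != NULL || count == 0);
  for (i = 0; i < count; i++) {
    switch (log[i].type) {
    case SKETCH_UNDO:
      undo_lsn = log[i].lsn;            /* newest UNDO heads the chain */
      break;
    case SKETCH_CLR_END:
      undo_lsn = log[i].prev_undo_lsn;  /* compensated UNDO is skipped */
      break;
    case SKETCH_REDO:
      break;                            /* REDOs do not touch the chain */
    }
  }
  return undo_lsn;
}
```

With the sequence from the message (REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END), the function returns 0 in this sketch, meaning nothing is left to roll back; dropping the final CLR_END makes it return the UNDO_INSERT's LSN instead.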
  /*
    This LSN serves in this situation; assume log is:
    FILE_ID(6->"t2") REDO_INSERT(6) FILE_ID(6->"t1") CHECKPOINT(6->"t1")
    then crash, checkpoint record is parsed and opens "t1" with id 6; assume
    REDO phase starts from the REDO_INSERT above: it will wrongly try to
    update a page of "t1". With this LSN below, REDO_INSERT can realize the
    mapping is newer than itself, and not execute.
    Same example is possible with UNDO_INSERT (update of the state).
  */
  info->s->lsn_of_file_id= lsn_of_file_id;
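The scenario in the comment above can be sketched as a small standalone check. This is an illustration only, not the code of get_MARIA_HA_from_REDO_record(): the type `sketch_lsn`, the struct `sketch_table_map` and the function `sketch_mapping_applies` are hypothetical names, and LSNs are assumed to be plain increasing integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: larger means later in the log. */
typedef uint64_t sketch_lsn;

/* One entry of an id->name mapping table (assumed shape). */
struct sketch_table_map {
  const char *name;           /* table currently mapped to the short id */
  sketch_lsn lsn_of_file_id;  /* LSN of the FILE_ID record that made the mapping */
};

/*
  Returns 1 if a record located at rec_lsn may be applied through this
  mapping, 0 if the mapping is newer than the record and the record must
  be skipped.
*/
int sketch_mapping_applies(const struct sketch_table_map *map,
                           sketch_lsn rec_lsn)
{
  if (map == NULL || map->name == NULL)
    return 0;                          /* no table known for this id */
  return rec_lsn > map->lsn_of_file_id;
}
```

Taking the log from the comment with assumed LSNs 10, 20, 30 and 40, the mapping 6->"t1" carries LSN 30, so the REDO_INSERT at LSN 20 is skipped while any record logged after LSN 30 is applied.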
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
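The ma_bitmap.c note in the commit message above describes extending the file to 8190 bytes and then writing the 2-byte end marker, so that a crash between the two steps leaves a page that is visibly too short. A rough sketch with plain POSIX calls follows; it does not use the mysys wrappers of the real code, the names `sketch_create_first_bitmap_page` and `sketch_bitmap_page_is_complete` are hypothetical, and the "bm" marker value is taken from the commit messages rather than checked against the source.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define SKETCH_BITMAP_PAGE_SIZE 8192
#define SKETCH_MARKER_SIZE 2

/* End marker of a bitmap page; "bm" as named in the commit messages. */
static const unsigned char sketch_bitmap_marker[SKETCH_MARKER_SIZE] = { 'b', 'm' };

/*
  Step 1 grows the file to 8190 bytes only; step 2 appends the marker.
  A crash between the steps leaves a file shorter than one page.
  Returns 0 on success, -1 on failure.
*/
int sketch_create_first_bitmap_page(int fd)
{
  const off_t marker_pos = SKETCH_BITMAP_PAGE_SIZE - SKETCH_MARKER_SIZE;
  if (ftruncate(fd, marker_pos) != 0)
    return -1;
  if (pwrite(fd, sketch_bitmap_marker, SKETCH_MARKER_SIZE, marker_pos)
      != (ssize_t) SKETCH_MARKER_SIZE)
    return -1;
  return 0;
}

/* Returns 1 if the first page is full-sized and ends with the marker. */
int sketch_bitmap_page_is_complete(int fd)
{
  struct stat st;
  unsigned char tail[SKETCH_MARKER_SIZE];
  const off_t marker_pos = SKETCH_BITMAP_PAGE_SIZE - SKETCH_MARKER_SIZE;
  if (fstat(fd, &st) != 0 || st.st_size < SKETCH_BITMAP_PAGE_SIZE)
    return 0;                 /* too short: caller would recreate the page */
  if (pread(fd, tail, SKETCH_MARKER_SIZE, marker_pos)
      != (ssize_t) SKETCH_MARKER_SIZE)
    return 0;
  return memcmp(tail, sketch_bitmap_marker, SKETCH_MARKER_SIZE) == 0;
}
```

The design point is that the short file is an unambiguous "recreate me" signal, whereas a full-sized page without a marker cannot be told apart from corruption.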
  all_tables[sid].info= info;
  all_tables[sid].org_kfile= org_kfile;
  all_tables[sid].org_dfile= org_dfile;
  /*
    We don't set info->s->id, it would be useless (no logging in REDO phase);
    if you change that, know that some records in REDO phase call
    _ma_update_create_rename_lsn() which resets info->s->id.
  */
  tprint(tracef, ", opened");
  error= 0;
end:
  tprint(tracef, "\n");
  if (error)
  {
    if (info != NULL)
      maria_close(info);
    if (error == -1)
      error= 0;
  }
  return error;
}
prototype_redo_exec_hook(REDO_INSERT_ROW_HEAD)
{
int error= 1;
uchar *buff= NULL;
MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
if (info == NULL)
{
/*
Table was skipped at open time (because later dropped/renamed, not
transactional, or create_rename_lsn newer than LOGREC_FILE_ID); it is
not an error.
*/
return 0;
}
/*
If REDO's LSN is > page's LSN (read from disk), we are going to modify the
page and change its LSN. The normal runtime code stores the UNDO's LSN
into the page. Here storing the REDO's LSN (rec->lsn) would work
(we are not writing to the log here, so don't have to "flush up to UNDO's
LSN"). But in a test scenario where we do updates at runtime, then remove
tables, apply the log and check that this results in the same table as at
runtime, putting the same LSN as runtime had done will decrease
differences. So we use the UNDO's LSN which is current_group_end_lsn.
*/
enlarge_buffer(rec);
if (log_record_buffer.str == NULL)
{
eprint(tracef, "Failed to allocate buffer for record\n");
goto end;
}
if (translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
eprint(tracef, "Failed to read record\n");
goto end;
}
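/*
  Record layout: file id, then page number and directory position
  (the header), then the row data.
*/
buff= log_record_buffer.str;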
if (_ma_apply_redo_insert_row_head_or_tail(info, current_group_end_lsn,
HEAD_PAGE,
buff + FILEID_STORE_SIZE,
buff +
FILEID_STORE_SIZE +
PAGE_STORE_SIZE +
DIRPOS_STORE_SIZE,
rec->record_length -
(FILEID_STORE_SIZE +
PAGE_STORE_SIZE +
DIRPOS_STORE_SIZE)))
goto end;
error= 0;
end:
return error;
}
prototype_redo_exec_hook(REDO_INSERT_ROW_TAIL)
2007-07-26 11:56:21 +02:00
{
  int error= 1;
  uchar *buff;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record\n");
    goto end;
  }
  buff= log_record_buffer.str;
  /*
    The record starts with the file id; the page number and directory
    position follow, and the remaining bytes are passed as the data to apply.
  */
  if (_ma_apply_redo_insert_row_head_or_tail(info, current_group_end_lsn,
                                             TAIL_PAGE,
                                             buff + FILEID_STORE_SIZE,
                                             buff +
                                             FILEID_STORE_SIZE +
                                             PAGE_STORE_SIZE +
                                             DIRPOS_STORE_SIZE,
                                             rec->record_length -
                                             (FILEID_STORE_SIZE +
                                              PAGE_STORE_SIZE +
                                              DIRPOS_STORE_SIZE)))
    goto end;
  error= 0;
end:
  return error;
}


prototype_redo_exec_hook(REDO_INSERT_ROW_BLOBS)
{
  int error= 1;
  uchar *buff;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  /* No table found for this REDO: not an error, nothing to apply */
  if (info == NULL)
    return 0;
  /* Read the whole record into log_record_buffer */
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record\n");
    goto end;
  }
  buff= log_record_buffer.str;
  /* Skip the file id stored at the start of the record */
  if (_ma_apply_redo_insert_row_blobs(info, current_group_end_lsn,
                                      buff + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}


prototype_redo_exec_hook(REDO_PURGE_ROW_HEAD)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  /* No table found for this REDO: not an error, nothing to apply */
  if (info == NULL)
    return 0;
  if (_ma_apply_redo_purge_row_head_or_tail(info, current_group_end_lsn,
                                            HEAD_PAGE,
                                            rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}
prototype_redo_exec_hook(REDO_PURGE_ROW_TAIL)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
    return 0;
  if (_ma_apply_redo_purge_row_head_or_tail(info, current_group_end_lsn,
                                            TAIL_PAGE,
                                            rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}
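The two hooks above share one shape: look up the table, treat a missing table as "nothing to do", apply the change, and report failure through a single exit label. The commit notes add that a missing table is not an error for a REDO but is one for an UNDO. A minimal, self-contained sketch of that control flow follows. Every name in it (`table_t`, `log_rec_t`, `find_table`, `apply_to_page`, `exec_record`) is hypothetical and stands in for the real Maria types and functions, which are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Maria structures. */
typedef struct { int id; int rows; } table_t;
typedef struct { int table_id; int is_undo; } log_rec_t;

static table_t tables[]= { {1, 10}, {2, 20} };

/* Return the table the record refers to, or NULL if it is not known. */
static table_t *find_table(const log_rec_t *rec)
{
  size_t i;
  for (i= 0; i < sizeof(tables) / sizeof(tables[0]); i++)
    if (tables[i].id == rec->table_id)
      return &tables[i];
  return NULL;
}

/* Placeholder for the page-level work; returns non-zero on failure. */
static int apply_to_page(table_t *table)
{
  table->rows--;
  return 0;
}

/* Return 0 on success (including "nothing to do"), 1 on error. */
int exec_record(const log_rec_t *rec)
{
  int error= 1;
  table_t *table= find_table(rec);
  if (table == NULL)
    return rec->is_undo ? 1 : 0;  /* missing table is fatal only for an UNDO */
  if (apply_to_page(table))
    goto end;
  error= 0;
end:
  return error;
}
```

Starting with `error= 1` and clearing it only after the last step means any early `goto end` reports failure without extra bookkeeping, which is the same convention the hooks above follow.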
Merge some changes from sql directory in 5.1 tree Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce idenital .MAD files as normal usage REDO_FREE_ROW_BLOCKS doesn't anymore change pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs
2007-10-19 23:24:22 +02:00
prototype_redo_exec_hook(REDO_FREE_BLOCKS)
{
  int error= 1;
  uchar *buff;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    eprint(tracef, "Failed to read record\n");
    goto end;
  }
  buff= log_record_buffer.str;
  if (_ma_apply_redo_free_blocks(info, current_group_end_lsn,
                                 buff + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}
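
The hook above follows a shape shared by several REDO hooks: skip the record when its table is unknown, grow a shared buffer, read the whole record body, then apply it. The following is a minimal standalone sketch of that control flow only. Every name in it (`log_record_t`, `find_table`, `read_record`, `apply_record`, `exec_redo_hook`) is a hypothetical stand-in invented for illustration, not the real Maria API, and the "log" is just an in-memory byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real Maria types. */
typedef struct { int unused; } table_t;

typedef struct
{
  const unsigned char *stored;  /* bytes actually present in the "log" */
  size_t stored_length;         /* how many of them can be read */
  size_t record_length;         /* length claimed by the record header */
  int table_is_known;           /* 0 simulates a dropped/unknown table */
} log_record_t;

static unsigned char *record_buffer= NULL;  /* shared, grown on demand */
static size_t record_buffer_length= 0;
static table_t the_table;
static int applied_count= 0;                /* how many records were applied */

/* Return the table the record refers to, or NULL if it is not known. */
static table_t *find_table(const log_record_t *rec)
{
  return rec->table_is_known ? &the_table : NULL;
}

/* Grow the shared buffer; on failure leave it NULL for the caller to check. */
static void enlarge_buffer(const log_record_t *rec)
{
  if (record_buffer_length < rec->record_length)
  {
    unsigned char *bigger= realloc(record_buffer, rec->record_length);
    if (bigger == NULL)
    {
      free(record_buffer);
      record_buffer_length= 0;
    }
    else
      record_buffer_length= rec->record_length;
    record_buffer= bigger;
  }
}

/* Copy what is available; the return value is the number of bytes read. */
static size_t read_record(const log_record_t *rec, unsigned char *to)
{
  size_t length= rec->stored_length < rec->record_length ?
                 rec->stored_length : rec->record_length;
  memcpy(to, rec->stored, length);
  return length;
}

/* Pretend to apply the record to the table; 0 means success. */
static int apply_record(table_t *table, const unsigned char *data,
                        size_t length)
{
  (void) table; (void) data; (void) length;
  applied_count++;
  return 0;
}

/* Same control flow as the hook above: 0 on success or skip, 1 on error. */
static int exec_redo_hook(const log_record_t *rec)
{
  int error= 1;
  table_t *table;
  assert(rec != NULL);
  table= find_table(rec);
  if (table == NULL)
    return 0;                   /* unknown table: skip, not an error */
  enlarge_buffer(rec);
  if (record_buffer == NULL ||
      read_record(rec, record_buffer) != rec->record_length)
    goto end;                   /* allocation failure or short read */
  if (apply_record(table, record_buffer, rec->record_length))
    goto end;
  error= 0;
end:
  return error;
}
```

In this sketch a record for an unknown table returns 0 without being applied, while a record whose body is shorter than its header claims returns 1. Keeping a single `end:` label gives every failure path the same exit, which is the design choice the real hook also appears to make.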


prototype_redo_exec_hook(REDO_FREE_HEAD_OR_TAIL)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
    return 0;
  if (_ma_apply_redo_free_head_or_tail(info, current_group_end_lsn,
                                       rec->header + FILEID_STORE_SIZE))
2007-07-26 11:56:21 +02:00
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_DELETE_ALL)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
    return 0;
  tprint(tracef, " deleting all %lu rows\n",
         (ulong)info->s->state.state.records);
  if (maria_delete_all_rows(info))
    goto end;
  error= 0;
end:
  return error;
}
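
The hook above follows a pattern that recurs in this file: start with error= 1, return 0 early when no table is found for the REDO record (not an error for a REDO), and clear the error only after the operation succeeds. A minimal standalone sketch of that control flow follows. All names prefixed with demo_ are hypothetical stand-ins invented for illustration; they are not the Maria types or functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for MARIA_HA and the log record header. */
typedef struct { unsigned long records; } demo_table;
typedef struct { demo_table *table; } demo_record;

/* Assumed lookup: returns NULL when the record's table is not known. */
static demo_table *demo_table_from_record(const demo_record *rec)
{
  return rec->table;
}

/* Assumed operation: returns non-zero on failure, like the real hooks' callees. */
static int demo_delete_all_rows(demo_table *t)
{
  t->records= 0;
  return 0;
}

/* Same shape as the REDO_DELETE_ALL hook: 0 on success or skip, 1 on failure. */
static int demo_redo_delete_all(const demo_record *rec)
{
  int error= 1;
  demo_table *t= demo_table_from_record(rec);
  if (t == NULL)
    return 0;                     /* no table for this REDO: skip, not an error */
  printf(" deleting all %lu rows\n", t->records);
  if (demo_delete_all_rows(t))
    goto end;                     /* leave error set to 1 */
  error= 0;
end:
  return error;
}
```

Keeping a single exit label means that cleanup added later only has to be written once, after end:.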
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
prototype_redo_exec_hook(REDO_INDEX)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
    return 0;                       /* no table to apply this record to */
  enlarge_buffer(rec);

  /* Read the whole record into log_record_buffer */
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record\n");
    goto end;
  }

  /* Apply the record body, skipping the file id stored at its start */
  if (_ma_apply_redo_index(info, current_group_end_lsn,
                           log_record_buffer.str + FILEID_STORE_SIZE,
                           rec->record_length - FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_INDEX_NEW_PAGE)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
    return 0;                       /* no table to apply this record to */
  enlarge_buffer(rec);

  /* Read the whole record into log_record_buffer */
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record\n");
    goto end;
  }
  if (_ma_apply_redo_index_new_page(info, current_group_end_lsn,
                                    log_record_buffer.str + FILEID_STORE_SIZE,
                                    rec->record_length - FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_INDEX_FREE_PAGE)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
    return 0;
  if (_ma_apply_redo_index_free_page(info, current_group_end_lsn,
                                     rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}
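
The REDO hooks above share one control-flow shape: start with error= 1, return 0 early when no table handle is available (the commit notes for this file say a missing table is not an error for a REDO), and clear the error only after the apply step succeeds. The following is a minimal standalone sketch of that shape. The names table_sketch, apply_step and redo_hook_sketch are hypothetical and do not exist in Maria; the real hooks take a log record and call functions such as _ma_apply_redo_index_free_page().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MARIA_HA; only carries a failure switch. */
struct table_sketch
{
  int apply_fails;  /* non-zero makes the apply step report failure */
};

/* Hypothetical apply step: returns non-zero on failure, like the
   _ma_apply_redo_*() calls in the hooks above. */
static int apply_step(struct table_sketch *info)
{
  return info->apply_fails;
}

/* Same shape as the REDO hooks: pessimistic default, early "not an
   error" return when there is no table, single exit label. */
static int redo_hook_sketch(struct table_sketch *info)
{
  int error= 1;
  if (info == NULL)
    return 0;          /* no table for this REDO: skip, not a failure */
  if (apply_step(info))
    goto end;          /* error stays 1 */
  error= 0;
end:
  return error;
}
```

Keeping error pessimistic until the last step means any goto end added later reports failure by default, which is probably why the hooks are written this way.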

#define set_undo_lsn_for_active_trans(TRID, LSN) do {  \
    all_active_trans[TRID].undo_lsn= LSN;                                \
    if (all_active_trans[TRID].first_undo_lsn == LSN_IMPOSSIBLE)         \
      all_active_trans[TRID].first_undo_lsn= LSN; } while (0)

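
To illustrate what the macro above records, here is a small standalone sketch: undo_lsn is overwritten on every call, while first_undo_lsn is set only the first time. The struct active_trans_sketch, the LSN typedef, the value chosen for LSN_IMPOSSIBLE and the helper replay_undos() are assumptions made for this sketch, not the definitions used by Maria.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types live in the Maria headers. */
typedef uint64_t LSN;
#define LSN_IMPOSSIBLE ((LSN) 0)  /* assumption: 0 means "no LSN yet" */

struct active_trans_sketch
{
  LSN undo_lsn;        /* most recent UNDO seen for this transaction */
  LSN first_undo_lsn;  /* first UNDO seen; written only once */
};

/* Static storage is zero-initialized, so every slot starts with
   first_undo_lsn == LSN_IMPOSSIBLE. */
static struct active_trans_sketch all_active_trans[4];

#define set_undo_lsn_for_active_trans(TRID, LSN_VAL) do {          \
    all_active_trans[TRID].undo_lsn= (LSN_VAL);                    \
    if (all_active_trans[TRID].first_undo_lsn == LSN_IMPOSSIBLE)   \
      all_active_trans[TRID].first_undo_lsn= (LSN_VAL); } while (0)

/* Hypothetical helper: feeds a sequence of UNDO LSNs for one
   transaction through the macro, in log order. */
static void replay_undos(unsigned trid, const LSN *lsns, unsigned count)
{
  unsigned i;
  for (i= 0; i < count; i++)
    set_undo_lsn_for_active_trans(trid, lsns[i]);
}
```

After replaying LSNs 100, 250 and 400 for one transaction, its first_undo_lsn is 100 and its undo_lsn is 400; other slots stay untouched.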
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
prototype_redo_exec_hook(UNDO_ROW_INSERT)
{
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  MARIA_SHARE *share;
  if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
return 0;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
share= info->s;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
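The checksum-delta scheme described in this commit message (log the 4-byte difference between the new and old live checksum, and let the REDO phase apply it only when the table state is older than the record) can be sketched roughly as below. This is a minimal illustration, not the code in ma_blockrec.c or ma_recovery.c: the types `table_state` and `undo_record` and both function names are hypothetical, and advancing `is_of_lsn` inside the apply function is an assumption made here only to show idempotency.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the table state and an UNDO record. */
typedef struct {
  uint32_t checksum;   /* live checksum of the table */
  uint64_t is_of_lsn;  /* LSN up to which this state is current */
} table_state;

typedef struct {
  uint64_t lsn;             /* LSN of the UNDO record */
  uint32_t checksum_delta;  /* new checksum minus old checksum, modulo 2^32 */
} undo_record;

/* Compute the 4-byte delta to store in the UNDO record. */
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;  /* unsigned arithmetic wraps around, as intended */
}

/* REDO phase: apply the delta only if the state is older than the record. */
static void redo_apply_checksum(table_state *state, const undo_record *rec)
{
  if (state->is_of_lsn < rec->lsn)
  {
    state->checksum += rec->checksum_delta;
    state->is_of_lsn = rec->lsn;  /* assumption: makes a second pass a no-op */
  }
}
```

With the numbers from the message, a change from 123 to 456 logs a delta of 333; applying the same record twice leaves the checksum at 456 because the second call sees a state that is no longer older than the record.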
2007-10-02 18:02:09 +02:00
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
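The three-way record_ends_group distinction described in the commit message above can be illustrated with a small state machine. This is only a sketch: the enum, struct and function names are hypothetical stand-ins, not the identifiers used in ma_loghandler.h or ma_recovery.c, and it counts REDOs instead of executing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names modelling the three roles a record type can have. */
enum group_role
{
  REC_NOT_GROUP_END,     /* e.g. a REDO: belongs to a group, does not end it */
  REC_GROUP_END,         /* e.g. an UNDO: completes the current group */
  REC_IS_GROUP_ITSELF    /* e.g. COMMIT: a group in itself */
};

struct group_state
{
  size_t pending;        /* REDOs seen in the current, unfinished group */
  size_t executed;       /* REDOs of completed groups */
  size_t aborted;        /* REDOs of groups that were abandoned */
};

/* Feed one scanned log record to the (simplified) REDO-phase scanner. */
static void scan_record(struct group_state *st, enum group_role role)
{
  switch (role)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                    /* remember it, do not execute yet */
    break;
  case REC_GROUP_END:
    st->executed+= st->pending;       /* group is complete: execute it */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;        /* aborts any previous unfinished group */
    st->pending= 0;
    break;
  }
}
```

With this shape, REDOs followed directly by a COMMIT (the physical-write-failure scenario described above) are never executed, because no group-ending record was logged for them.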
{
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDOs about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
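The pinning fix described in the commit message above (do not stamp a page with the UNDO's LSN until every REDO of the group has been applied) can be sketched as follows. The types and function names are hypothetical, LSNs are reduced to plain integers, and "applying" a REDO is reduced to a counter; the real logic lives in ma_blockrec.c and ma_recovery.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;     /* simplified: a single integer instead of (file, offset) */

struct page
{
  lsn_t lsn;                /* LSN stamped on the page */
  int applied;              /* number of REDOs applied to this page */
  int pinned;               /* touched by the current group, not stamped yet */
};

/* Apply one REDO unless the page already reflects it; leave the page LSN alone. */
static void apply_redo(struct page *pg, lsn_t redo_lsn)
{
  if (pg->lsn >= redo_lsn)
    return;                 /* page is at least as new as the record: skip */
  pg->applied++;
  pg->pinned= 1;            /* keep pinned until the group is finished */
}

/* At the end of the group, unpin touched pages and stamp them with the UNDO's LSN. */
static void end_group(struct page *pages, int n, lsn_t undo_lsn)
{
  for (int i= 0; i < n; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;
      pages[i].pinned= 0;
    }
  }
}
```

If apply_redo() stamped the page with the UNDO's LSN immediately, a second REDO of the same group on the same page would have a lower LSN than the page and would be skipped, which is the bug the message describes.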
tprint(tracef, " state has LSN (%lu,0x%lx) older than record, updating"
" rows' count\n", LSN_IN_PARTS(share->state.is_of_horizon));
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
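The checksum-delta mechanism from the commit message above (the 123 to 456 example, stored as 333 in 4 bytes) can be sketched like this. All names are hypothetical stand-ins, the little-endian byte order is an assumption of the sketch, and the "state is older than the record" test is reduced to an integer comparison; advancing the state's horizon is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t checksum_t;    /* stand-in for the table's checksum type */
typedef uint64_t lsn_t;         /* simplified LSN */

#define DELTA_STORE_SIZE 4      /* hypothetical name: bytes used for the delta */

struct table_state
{
  checksum_t checksum;          /* live checksum */
  uint64_t records;             /* row count */
  lsn_t is_of_horizon;          /* state reflects the log up to here */
};

/* Store (after - before) in 4 bytes; unsigned arithmetic wraps modulo 2^32. */
static void delta_store(unsigned char *buf, checksum_t before, checksum_t after)
{
  checksum_t d= after - before;
  buf[0]= (unsigned char) (d & 0xFF);
  buf[1]= (unsigned char) ((d >> 8) & 0xFF);
  buf[2]= (unsigned char) ((d >> 16) & 0xFF);
  buf[3]= (unsigned char) ((d >> 24) & 0xFF);
}

static checksum_t delta_read(const unsigned char *buf)
{
  return (checksum_t) buf[0] | ((checksum_t) buf[1] << 8) |
         ((checksum_t) buf[2] << 16) | ((checksum_t) buf[3] << 24);
}

/* REDO phase sees an UNDO_ROW_INSERT-like record: update the state if it is older. */
static void redo_seen_undo_row_insert(struct table_state *st, lsn_t rec_lsn,
                                      const unsigned char *buf)
{
  if (st->is_of_horizon >= rec_lsn)
    return;                     /* state already includes this record */
  st->records++;
  st->checksum+= delta_read(buf);
}
```

Because the delta is added with wrapping unsigned arithmetic, a delta that lowers the checksum (456 back to 123) round-trips just as well as one that raises it.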
share->state.state.records++;
if (share->calc_checksum)
{
uchar buff[HA_CHECKSUM_STORE_SIZE];
if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
HA_CHECKSUM_STORE_SIZE, buff, NULL) !=
HA_CHECKSUM_STORE_SIZE)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to read record\n");
return 1;
}
share->state.state.checksum+= ha_checksum_korr(buff);
}
/**
@todo some bits below will rather be set when executing UNDOs related
to keys
*/
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
info->s->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
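The commit message above refines "does this record end a group?" from a bool into three cases: a record may not end a group, may end a group, or may be a group in itself (like COMMIT, which aborts any unfinished group before it). A minimal sketch of that classification follows. It is only an illustration: every `toy_*` name is hypothetical and is not the identifier used in ma_loghandler.h, and counting records stands in for actually executing them.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the real enum in ma_loghandler.h. */
typedef enum
{
  TOY_NOT_LAST_IN_GROUP,  /* e.g. a REDO: held back until its group ends */
  TOY_LAST_IN_GROUP,      /* e.g. an UNDO: closes the group, which is then run */
  TOY_IS_GROUP_ITSELF     /* e.g. a COMMIT: aborts any unfinished group */
} toy_group_kind;

typedef struct
{
  int pending;   /* records of the currently open, unfinished group */
  int executed;  /* records executed so far */
  int aborted;   /* records dropped because their group never ended */
} toy_group_state;

/* Feed one log record's group kind into the state machine. */
void toy_see_record(toy_group_state *st, toy_group_kind kind)
{
  assert(st != 0);
  switch (kind)
  {
  case TOY_NOT_LAST_IN_GROUP:
    st->pending++;                    /* remember it, do not execute yet */
    break;
  case TOY_LAST_IN_GROUP:
    st->executed+= st->pending + 1;   /* run the whole group, incl. this one */
    st->pending= 0;
    break;
  case TOY_IS_GROUP_ITSELF:
    st->aborted+= st->pending;        /* unfinished group is thrown away */
    st->pending= 0;
    st->executed++;                   /* the record itself is processed */
    break;
  }
}
```

Feeding REDO, REDO, UNDO executes three records; feeding REDO and then COMMIT drops the lone REDO, which matches the write-failure scenario described in the message.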
}
tprint(tracef, " rows' count %lu\n", (ulong)info->s->state.state.records);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
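The fix described in the commit message above (keep pages pinned while a group's REDOs are applied, and stamp them with the UNDO's LSN only once the group is done) can be modelled in a few lines. This is a toy sketch under stated assumptions: the `toy_*` types and functions are hypothetical, the "skip if the page's LSN is not older than the record's" rule is the usual idempotency check and is assumed here, and nothing below is the real page cache API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_lsn;

enum { TOY_MAX_PINNED = 8 };

/* Toy page: its LSN stamp plus a counter of changes applied to it. */
typedef struct { toy_lsn lsn; int applied; } toy_page;

/* Pages touched by the current group (hypothetical stand-in for pinned pages). */
typedef struct { toy_page *pages[TOY_MAX_PINNED]; size_t count; } toy_pins;

/*
  Apply one REDO to a page. The REDO is skipped when the page already
  carries an LSN >= the record's LSN. The page's LSN is NOT changed here;
  the page is only remembered as pinned. Returns 1 if applied, 0 if skipped.
*/
int toy_apply_redo(toy_pins *pins, toy_page *page, toy_lsn redo_lsn)
{
  size_t i;
  if (page->lsn >= redo_lsn)
    return 0;
  page->applied++;
  for (i= 0; i < pins->count; i++)
    if (pins->pages[i] == page)
      return 1;                       /* already pinned by this group */
  assert(pins->count < TOY_MAX_PINNED);
  pins->pages[pins->count++]= page;
  return 1;
}

/* End of group: unpin every page and stamp it with the UNDO's LSN. */
void toy_unpin_all_pages(toy_pins *pins, toy_lsn undo_lsn)
{
  size_t i;
  for (i= 0; i < pins->count; i++)
    pins->pages[i]->lsn= undo_lsn;
  pins->count= 0;
}
```

With REDOs at LSN 10 and 20 on the same page and an UNDO at LSN 30, both REDOs are applied because the stamp is deferred. Had the page been stamped with 30 right after the first REDO, the second would have been skipped, which is the bug the message describes.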
/* Unpin all pages, stamp them with UNDO's LSN */
_ma_unpin_all_pages(info, rec->lsn);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
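According to the commit message above, a CLR_END stores the LSN of the previous UNDO and the type of record that was undone, so that the REDO phase can update trn->undo_lsn and, where appropriate, state.records. The sketch below illustrates that bookkeeping only. All `toy_*` names are hypothetical, and the record layout is invented for the example. The message states the records-- for an undone insert; the records++ for an undone delete is the symmetric case and is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_lsn;

/* Type of the row operation that the CLR_END says was undone. */
typedef enum
{
  TOY_UNDONE_ROW_INSERT,
  TOY_UNDONE_ROW_DELETE,
  TOY_UNDONE_ROW_UPDATE
} toy_undone_type;

/* Hypothetical, simplified content of a CLR_END record. */
typedef struct
{
  toy_lsn previous_undo_lsn;  /* may be 0: the undone UNDO was the first one */
  toy_undone_type undone;
} toy_clr_end;

typedef struct { toy_lsn undo_lsn; } toy_trn;     /* per-transaction state */
typedef struct { uint64_t records; } toy_state;   /* per-table row count */

/* REDO phase: replay the bookkeeping effects of one CLR_END. */
void toy_redo_clr_end(toy_trn *trn, toy_state *state, const toy_clr_end *clr)
{
  assert(trn != 0 && state != 0 && clr != 0);
  /* The transaction's next UNDO to run is the one before the undone one. */
  trn->undo_lsn= clr->previous_undo_lsn;
  switch (clr->undone)
  {
  case TOY_UNDONE_ROW_INSERT:
    assert(state->records > 0);
    state->records--;           /* the inserted row was removed again */
    break;
  case TOY_UNDONE_ROW_DELETE:
    state->records++;           /* the deleted row was re-inserted */
    break;
  case TOY_UNDONE_ROW_UPDATE:
    break;                      /* row count unchanged */
  }
}
```

Because previous_undo_lsn can legitimately be 0, a zero LSN cannot double as a "not UNDO work" marker, which is why the message switches that marker to LSN_ERROR.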
return 0;
}
prototype_redo_exec_hook(UNDO_ROW_DELETE)
{
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
return 0;
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
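The commit message above mentions shifting the record number stored in keys up by one bit so that a flag can say whether a transid follows. A minimal sketch of that idea, assuming the flag lives in the lowest bit; the helper names pack_key_recpos() and unpack_key_recpos() are hypothetical and are not Maria's actual functions:

```c
#include <stdint.h>

/*
  Hypothetical helpers (not part of Maria): store a record position in a
  key with one extra low bit that says whether a transaction id follows.
  Assumption: the flag is the lowest bit; the real on-disk layout may differ.
*/
static uint64_t pack_key_recpos(uint64_t recpos, int transid_follows)
{
  /* Shift the position up by one bit and put the flag in bit 0 */
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

static uint64_t unpack_key_recpos(uint64_t stored, int *transid_follows)
{
  /* Bit 0 is the flag, the remaining bits are the record position */
  *transid_follows= (int) (stored & 1);
  return stored >> 1;
}
```

The cost of this encoding is one bit of range for record positions; in exchange, a reader can tell from the stored number alone whether more data follows.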
share= info->s;
WL#3072 Maria recovery

* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)

mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full-size bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
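The ma_bitmap.c note above relies on a simple distinction: a first bitmap page that is shorter than a full block can safely be recreated, while a full-size page without its end marker looks corrupted. A minimal sketch of that decision, assuming a 2-byte end marker at the end of the block; classify_first_bitmap_page() and the enum are hypothetical names, and the marker bytes are passed in rather than assumed:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, for illustration only */
enum bitmap_page_state { BITMAP_OK, BITMAP_RECREATE, BITMAP_CORRUPT };

/*
  Decide what recovery should do with the first bitmap page.
  'file' holds the first 'file_len' bytes of the data file; 'marker' is
  the 2-byte end marker expected in the last two bytes of the block.
*/
static enum bitmap_page_state
classify_first_bitmap_page(const unsigned char *file, size_t file_len,
                           size_t block_size, const unsigned char marker[2])
{
  /* Short file (e.g. 8190 of 8192 bytes): creation was interrupted */
  if (file_len < block_size)
    return BITMAP_RECREATE;
  /* Full-size page: the end marker must be present */
  if (memcmp(file + block_size - 2, marker, 2) == 0)
    return BITMAP_OK;
  return BITMAP_CORRUPT;
}
```

Extending the file to block_size minus 2 bytes first means a crash before the marker write lands in the BITMAP_RECREATE case instead of BITMAP_CORRUPT.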
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
WL#3072 - Maria recovery.

* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support.

storage/maria/ha_maria.cc:
  question for Monty
storage/maria/ma_blockrec.c:
  * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c).
  * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the record being undone, and also to pass down to all UNDOs the live checksum variation caused by the operation.
  * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123).
  * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr.
  * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
  * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed.
  * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this.
  * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum.
storage/maria/ma_blockrec.h:
  in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
  more straightforward size of buffer
storage/maria/ma_checkpoint.c:
  <= is enough
storage/maria/ma_commit.c:
  new prototype of translog_write_record()
storage/maria/ma_create.c:
  new prototype of translog_write_record()
storage/maria/ma_delete.c:
  The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  @todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
  new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
  * in-write hooks move to ma_blockrec.c.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
  * fix for compiler warning (unused buffer_start when compiling without debug support)
  * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise).
  * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END.
  * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
  * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
storage/maria/ma_open.c:
  fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
  <= is enough
storage/maria/ma_recovery.c:
  * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
  * In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL.
  * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert().
  * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9)
  * removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
  new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
  Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that the checksums do not match merely because they are all 0 now: they are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too.
storage/maria/ma_update.c:
  * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway.
  * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function.
storage/maria/ma_write.c:
  If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  new prototype of translog_write_record()
storage/myisam/sort.c:
  fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
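The checksum-delta mechanism described in the commit message above can be illustrated with a small sketch. It assumes a much simplified table state and plain integer LSNs; struct table_state, checksum_delta() and redo_apply_undo_row_delete() are hypothetical names, not the functions in ma_recovery.c, and the strict "state is older" comparison follows the wording of the commit message rather than the exact test in the code:

```c
#include <stdint.h>

typedef uint64_t lsn_t;                 /* simplified stand-in for an LSN */

/* Hypothetical, reduced version of the table state */
struct table_state
{
  lsn_t is_of_lsn;                      /* state reflects the log up to here */
  uint32_t checksum;                    /* live checksum of the table */
  uint64_t records;                     /* number of rows */
};

/* Delta stored in the UNDO record; unsigned arithmetic wraps as intended */
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/*
  REDO phase seeing an UNDO_ROW_DELETE: apply the change only if the
  state is older than the record. Returns 1 if the state was updated.
*/
static int redo_apply_undo_row_delete(struct table_state *st, lsn_t rec_lsn,
                                      uint32_t delta)
{
  if (rec_lsn <= st->is_of_lsn)
    return 0;                           /* state already includes this row */
  st->records--;
  st->checksum+= delta;
  return 1;
}
```

With the numbers from the commit message, a DELETE that moved the checksum from 123 to 456 logs a delta of 333, and replaying it on a state whose checksum is 123 yields 456 again without reading the row.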
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
{
tprint(tracef, " state older than record\n");
    share->state.state.records--;
    if (share->calc_checksum)
    {
      uchar buff[HA_CHECKSUM_STORE_SIZE];
      if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
                               PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
                               HA_CHECKSUM_STORE_SIZE, buff, NULL) !=
          HA_CHECKSUM_STORE_SIZE)
      {
Added error HA_ERR_FILE_TOO_SHORT, to be used when files are shorter than expected (by my_read/my_pread).
Added debugger hook _my_dbug_put_break_here(), which is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break).
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel).
REDO_REPAIR now also logs the used key map.
Fixed some bugs in REDO logging of key pages.
Better error messages from maria_read_log.
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with fewer than 8 blocks (causes strange crashes).
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined, some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise).
Fixed bug in ma_pagecache unit tests that caused the program to sometimes fail.
Added some missing calls to MY_INIT(); their absence caused some unit tests to fail.
Fixed TRUNCATE so that it works properly on temporary MyISAM files.
Updated some result files to the new table checksum results (checksum when NULL fields are ignored).
perl test-insert can be replayed with maria_read_log!

sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get a matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of a too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of a too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Log used key map in REDO log
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed unneeded pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Removed unneeded include file
  Changed logging to use fd: for file descriptors, like other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with fewer than 8 blocks
  Removed wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call the same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined, some REDO_INDEX entries contain page checksums
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused the program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
        eprint(tracef, "Failed to read record\n");
WL#3072 - Maria recovery.
* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support.

storage/maria/ha_maria.cc:
  question for Monty
storage/maria/ma_blockrec.c:
  * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c).
  * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the record being undone, and also to pass down to all UNDOs the live checksum variation caused by the operation.
  * If the table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123).
  * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr.
  * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
  * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed.
  * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this.
  * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum.
storage/maria/ma_blockrec.h:
  in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
  more straightforward size of buffer
storage/maria/ma_checkpoint.c:
  <= is enough
storage/maria/ma_commit.c:
  new prototype of translog_write_record()
storage/maria/ma_create.c:
  new prototype of translog_write_record()
storage/maria/ma_delete.c:
  The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  @todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
  new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
  * in-write hooks move to ma_blockrec.c.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable.
  * fix for compiler warning (unused buffer_start when compiling without debug support)
  * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise).
  * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END.
  * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
  * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable.
storage/maria/ma_open.c:
  fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
  <= is enough
storage/maria/ma_recovery.c:
  * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a one-digit fractional part (like 123.4 seconds).
  * In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL.
  * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert().
  * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9)
  * removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
  new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
  Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that the checksums do not match merely because they are all 0 now: they are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too.
storage/maria/ma_update.c:
  * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway.
  * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function.
storage/maria/ma_write.c:
  If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  new prototype of translog_write_record()
storage/myisam/sort.c:
  fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
        return 1;
      }
      share->state.state.checksum+= ha_checksum_korr(buff);
    }
Fixes for redo/undo logging of key pages
New extendable format for maria_log_control file
Fixed some compiler warnings

include/maria.h:
  Added maria_disable_logging() and maria_enable_logging()
mysql-test/include/maria_verify_recovery.inc:
  Updated tests now that key redo/undo works
mysql-test/r/maria-recovery.result:
  Updated tests now that key redo/undo works
storage/maria/ma_blockrec.c:
  Use unified CLR code
  Added rec_lsn for full pages
  Moved clr write hook to ma_key_recover.c
  Changed REDO code to keep pages pinned until undo
  Mark page_link's as changed
storage/maria/ma_blockrec.h:
  Moved write_hook_for_clr_end() to ma_key_recover.c
storage/maria/ma_check.c:
  Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE
  Fixed wrong warning when checking files after maria_pack
  When unpacking files, we have to use new keypos_to_recpos method
  When doing repair, we can disregard index key file pages in page cache
storage/maria/ma_commit.c:
  Added simple enable/disable logging functions (needed for recovery)
storage/maria/ma_control_file.c:
  Make maria control file extendable without having to make it incompatible with older versions
storage/maria/ma_control_file.h:
  New error messages
  Added CONTROL_FILE_VERSION
storage/maria/ma_delete.c:
  Added redo/undo for key pages
  change_length -> changed_length to make things similar
  More comments & more DBUG
storage/maria/ma_key_recover.c:
  Unified CLR method
  Moved here write_hook_for_clr_end() and common keypage log functions
  Changed REDO to keep pages pinned until undo
  Changed UNDO code to change key_root under log mutex
storage/maria/ma_key_recover.h:
  New structures and functions
storage/maria/ma_loghandler.c:
  Include needed files
storage/maria/ma_open.c:
  Change maria_open() to use pread() instead of read()
storage/maria/ma_page.c:
  Fixed bug in key_del handling
  Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined
storage/maria/ma_pagecache.c:
  Indentation and spelling fixes
  More DBUG
  Added helper function: pagecache_block_link_to_buffer()
storage/maria/ma_pagecache.h:
  Added pagecache_block_link_to_buffer()
storage/maria/ma_recovery.c:
  Fixed state.changed
  Fixed REDO to keep pages pinned until UNDO
  Some bug fixes from previous commit
  Fixes for UNDO/REDO of key pages
storage/maria/ma_search.c:
  Fixed packing and storing of keys to provide more information to the caller so that we can do efficient REDO logging of the changes.
storage/maria/ma_test1.c:
  Fixed bug with uninitialized variable
storage/maria/ma_test2.c:
  Removed unused code
storage/maria/ma_test_all.res:
  Updated results
storage/maria/ma_test_all.sh:
  Changed one test to test more
  Removed timing tests as not relevant here
storage/maria/ma_test_recovery.expected:
  Updated test result now that redo/undo of key pages works
storage/maria/ma_test_recovery:
  Updated test now that redo/undo of key pages works
storage/maria/ma_write.c:
  Moved some general log functions to ma_key_recover.c
  Fixed some bugs in undo
  Moved ma_log_split() to _ma_split_page()
  Small changes in some function arguments to be able to do redo logging
storage/maria/maria_chk.c:
  disable logging while doing repair table
storage/maria/maria_def.h:
  New function prototypes
  Move some structs and functions to ma_key_recover.c
storage/maria/unittest/ma_control_file-t.c:
  Updated with patch from Sanja
  NOTE: This is not complete and needs to be updated to the new control file format
storage/maria/unittest/ma_test_loghandler-t.c:
  Fixed compiler warning
2007-11-20 16:42:16 +01:00
    share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
                            STATE_NOT_OPTIMIZED_ROWS);
  }
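The code above applies a checksum delta read from an UNDO_ROW_DELETE record during the REDO phase. The commit notes give the arithmetic: a DELETE that moves the live checksum from 123 to 456 stores 333 in 4 bytes. What follows is a minimal standalone sketch of that idea, not the engine's code. `struct table_state`, `delta_store()`, `delta_korr()` and `redo_apply_undo_row_delete()` are hypothetical names introduced for illustration. The little-endian byte order and the strict "state older than record" comparison are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the delta occupies 4 bytes, as the commit notes state. */
#define DELTA_STORE_SIZE 4

/* Hypothetical stand-in for the few table-state fields involved. */
struct table_state
{
  uint64_t records;    /* number of rows */
  uint32_t checksum;   /* live checksum; arithmetic wraps modulo 2^32 */
  uint64_t is_of_lsn;  /* the state reflects the log up to this LSN */
};

/* Store a 32-bit delta, low byte first (assumed byte order). */
static void delta_store(unsigned char *buf, uint32_t delta)
{
  buf[0]= (unsigned char) (delta & 0xFF);
  buf[1]= (unsigned char) ((delta >> 8) & 0xFF);
  buf[2]= (unsigned char) ((delta >> 16) & 0xFF);
  buf[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read back a delta written by delta_store(). */
static uint32_t delta_korr(const unsigned char *buf)
{
  return (uint32_t) buf[0] |
         ((uint32_t) buf[1] << 8) |
         ((uint32_t) buf[2] << 16) |
         ((uint32_t) buf[3] << 24);
}

/*
  REDO-phase handling of an UNDO_ROW_DELETE record: apply it only if the
  table state is older than the record. Returns 1 if applied, 0 if skipped.
*/
static int redo_apply_undo_row_delete(struct table_state *st,
                                      uint64_t rec_lsn,
                                      const unsigned char *delta_buf)
{
  if (st->is_of_lsn >= rec_lsn)
    return 0;                       /* state already includes this record */
  assert(st->records > 0);
  st->records--;
  /* Unsigned addition wraps, so a "negative" delta works unchanged. */
  st->checksum+= delta_korr(delta_buf);
  return 1;
}
```

Starting from a state with 10 records, checksum 123 and `is_of_lsn` 100, a record at LSN 200 whose buffer was filled with `delta_store(buf, 456u - 123u)` leaves 9 records and checksum 456. The same record at LSN 50 is skipped, which is what makes replaying the log against an already up-to-date state harmless.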
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and so is unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and so is unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so it checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
tprint(tracef, " rows' count %lu\n", (ulong)share->state.state.records);
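The live-checksum scheme described in the commit message above (a 4-byte delta stored in the UNDO record, applied during the REDO phase only when the table state is older than the record) can be sketched as follows. This is a minimal illustration, not the Maria code: `sketch_state`, `sketch_lsn`, `checksum_delta()`, `delta_store()`, `delta_korr()` and `redo_apply_delta()` are hypothetical names, and the low-byte-first layout of the 4 bytes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real Maria types. */
typedef uint64_t sketch_lsn;

typedef struct
{
  uint32_t   checksum;   /* the table's live checksum */
  sketch_lsn is_of_lsn;  /* the state already reflects the log up to here */
} sketch_state;

/* Variation caused by one operation; unsigned arithmetic wraps modulo 2^32. */
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/* Store the delta in 4 bytes, low byte first (an assumed layout). */
static void delta_store(unsigned char *to, uint32_t delta)
{
  to[0]= (unsigned char) (delta & 0xFF);
  to[1]= (unsigned char) ((delta >> 8) & 0xFF);
  to[2]= (unsigned char) ((delta >> 16) & 0xFF);
  to[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read back the 4 bytes written by delta_store(). */
static uint32_t delta_korr(const unsigned char *from)
{
  return (uint32_t) from[0] | ((uint32_t) from[1] << 8) |
         ((uint32_t) from[2] << 16) | ((uint32_t) from[3] << 24);
}

/*
  REDO phase: apply the stored delta only if the state is older than the
  record (is_of_lsn lower than the record's LSN). Returns 1 if applied.
*/
static int redo_apply_delta(sketch_state *state, sketch_lsn rec_lsn,
                            const unsigned char *stored)
{
  if (state->is_of_lsn >= rec_lsn)
    return 0;                        /* state already includes this record */
  state->checksum+= delta_korr(stored);
  return 1;
}
```

With the figures from the commit message, a checksum going from 123 to 456 stores 333, and adding 333 to a state that still holds 123 yields 456 again. Because the arithmetic is unsigned, a delta for a decreasing checksum wraps around and still restores the right value when added back.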
* WL#4137 Maria - Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: the test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDOs about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
_ma_unpin_all_pages(info, rec->lsn);
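The reason for keeping pages pinned until the end of a group, as explained in the commit message above, can be sketched like this. It is a simplified model under stated assumptions: `sketch_page`, `sketch_apply_redo()` and `sketch_unpin_all_pages()` are hypothetical, and the "skip the REDO if the page's LSN is not lower than the record's LSN" test is an assumed form of the check.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;

/* Hypothetical page model; not the real page cache structures. */
typedef struct
{
  sketch_lsn lsn;      /* LSN currently stamped on the page */
  int        pinned;   /* touched by the current group, LSN not yet updated */
  unsigned   applied;  /* number of REDOs applied, for illustration only */
} sketch_page;

/*
  Apply one REDO to a page. The REDO is skipped if the page already carries
  an LSN at or beyond the record (assumed form of the check). The page's
  LSN is deliberately left untouched; the page is only marked as pinned.
*/
static int sketch_apply_redo(sketch_page *page, sketch_lsn redo_lsn)
{
  if (page->lsn >= redo_lsn)
    return 0;                         /* already on the page, skip */
  page->applied++;
  page->pinned= 1;
  return 1;
}

/* End of the group: unpin every touched page and stamp the UNDO's LSN. */
static void sketch_unpin_all_pages(sketch_page *pages, unsigned count,
                                   sketch_lsn undo_lsn)
{
  unsigned i;
  for (i= 0; i < count; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;
      pages[i].pinned= 0;
    }
  }
}
```

If the first REDO of a group (say LSN 10) stamped the page with the UNDO's LSN (say 12) at once, a second REDO of the same group on the same page (LSN 11) would see 12 >= 11 and be wrongly skipped. Deferring the stamp lets both apply, and replaying the group afterwards skips both.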
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
return 0;
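What the REDO phase does on seeing a CLR_END, as described in the commit message above (set trn->undo_lsn to the previous UNDO's LSN, and adjust state.records according to the type of the undone record), can be sketched as below. All names are hypothetical stand-ins, and the sketch leaves out any check of whether the table state already includes the record.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;

/* Type of the record that the CLR_END says was undone (hypothetical enum). */
typedef enum
{
  SKETCH_UNDONE_ROW_INSERT,
  SKETCH_UNDONE_ROW_DELETE,
  SKETCH_UNDONE_ROW_UPDATE
} sketch_undone_type;

typedef struct
{
  sketch_lsn         previous_undo_lsn; /* may be 0: undone UNDO was the first */
  sketch_undone_type type;
} sketch_clr_end;

typedef struct { sketch_lsn undo_lsn; } sketch_trn;
typedef struct { uint64_t records; } sketch_table_state;

/* REDO phase: account for a CLR_END found in the log. */
static void sketch_redo_clr_end(sketch_trn *trn, sketch_table_state *state,
                                const sketch_clr_end *clr)
{
  /* Rollback, if resumed, continues from the UNDO before the undone one. */
  trn->undo_lsn= clr->previous_undo_lsn;
  switch (clr->type)
  {
  case SKETCH_UNDONE_ROW_INSERT:
    state->records--;                 /* the inserted row was removed */
    break;
  case SKETCH_UNDONE_ROW_DELETE:
    state->records++;                 /* the deleted row was re-inserted */
    break;
  case SKETCH_UNDONE_ROW_UPDATE:
    break;                            /* row count unchanged */
  }
}
```

Since the previous UNDO's LSN can legitimately be 0 when the undone record was the transaction's first UNDO, 0 cannot double as a "not UNDO work" marker, which is why the commit message switches that marker to LSN_ERROR.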
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually roll back). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
}
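The three-way distinction that the commit message above introduces for LOG_DESC::record_ends_group (a record does not end a group, ends a group, or is a group in itself and aborts any unfinished group before it) can be sketched as a small scanner. The enum and struct names here are hypothetical, and the counters only stand in for "execute" and "discard".

```c
#include <assert.h>

/* Hypothetical names for the three kinds of record. */
typedef enum
{
  SKETCH_NOT_LAST_IN_GROUP,  /* e.g. a REDO: waits for the group's end */
  SKETCH_LAST_IN_GROUP,      /* e.g. an UNDO: ends the group of REDOs */
  SKETCH_IS_GROUP_ITSELF     /* e.g. a COMMIT: aborts any unfinished group */
} sketch_group_kind;

typedef struct
{
  unsigned pending;   /* records seen since the last group boundary */
  unsigned executed;  /* records of complete groups, safe to execute */
  unsigned aborted;   /* records of unfinished groups, discarded */
} sketch_scanner;

/* Feed one log record's kind to the scanner. */
static void sketch_see_record(sketch_scanner *s, sketch_group_kind kind)
{
  switch (kind)
  {
  case SKETCH_NOT_LAST_IN_GROUP:
    s->pending++;
    break;
  case SKETCH_LAST_IN_GROUP:
    s->executed+= s->pending;       /* the group is complete */
    s->pending= 0;
    break;
  case SKETCH_IS_GROUP_ITSELF:
    s->aborted+= s->pending;        /* the previous group never finished */
    s->pending= 0;
    break;
  }
}
```

In the failure scenario from the commit message (a physical write error before the UNDO is logged, then a COMMIT), the pending REDOs end up in `aborted` rather than `executed`, so Recovery does not replay them.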
prototype_redo_exec_hook(UNDO_ROW_UPDATE)
{
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
if (info == NULL)
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.

storage/maria/ma_blockrec.c:
- bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead.
- when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
- when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE.
- in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state")

storage/maria/ma_loghandler.c:
- all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
- write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records.
- reset share->id to 0 when deassigning; not useful for now but sounds logical.

storage/maria/ma_recovery.c:
- if no table is found for a REDO, it's not an error; for an UNDO, it is
- in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
- in the UNDO phase, when we execute an UNDO_ROW_INSERT:
  * update trn->undo_lsn only after executing the record
  * store the _previous_ undo_lsn into the CLR_END
- at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.

storage/maria/ma_test1.c:
* where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
* --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery

storage/maria/ma_test_recovery:
* Testing execution of UNDOs, with and without BLOBs.
* Testing idempotency of REDOs.
* See @todo for a probable bug with BLOBs.
* maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c).
* Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
* Less output on the screen, compares with expected output in the end.
* some shell thingies like "set --" and $# are courtesy of Danny and Pekka.

storage/maria/maria_read_log.c:
when only displaying the records, don't do an UNDO phase

storage/maria/ma_test_recovery.expected:
This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
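The commit message above describes two ideas that a small model can make concrete: each UNDO (and each CLR_END) carries the LSN of the transaction's previous UNDO, and because that value can legitimately be 0, a separate impossible LSN is needed to mean "this is not UNDO work". The following is only a sketch under stated assumptions: every type and name (`sketch_undo`, `sketch_trn`, `SKETCH_LSN_ERROR`, the array-indexed "log") is hypothetical and does not reproduce the real Maria structures or the real value of LSN_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: LSNs are small indexes into an in-memory array. */
typedef uint64_t lsn_t;

/* Hypothetical sentinel standing in for "an impossible LSN". */
#define SKETCH_LSN_ERROR UINT64_MAX

typedef struct {
  lsn_t prev_undo_lsn;  /* LSN of the transaction's previous UNDO, 0 if none */
} sketch_undo;

typedef struct {
  lsn_t undo_lsn;       /* next UNDO to execute, 0 when nothing is left */
} sketch_trn;

/* 0 is a valid "previous UNDO" value, so only the sentinel means
   "the caller is not doing UNDO work". */
static int sketch_is_undo_work(lsn_t undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}

/* Executes one UNDO and returns the LSN a CLR_END would store:
   the LSN of the _previous_ UNDO. */
static lsn_t sketch_execute_undo(sketch_trn *trn, const sketch_undo *log,
                                 size_t log_len)
{
  lsn_t previous;
  assert(trn->undo_lsn != 0 && trn->undo_lsn < log_len);
  previous= log[trn->undo_lsn].prev_undo_lsn;
  /* ... the row change itself would be reverted here ... */
  trn->undo_lsn= previous;  /* updated only after executing the record */
  return previous;
}

/* Walks the whole chain; returns how many UNDOs were executed. */
static unsigned sketch_rollback(sketch_trn *trn, const sketch_undo *log,
                                size_t log_len)
{
  unsigned executed= 0;
  while (trn->undo_lsn != 0)
  {
    sketch_execute_undo(trn, log, log_len);
    executed++;
  }
  return executed;
}
```

In this model, a crash in the middle of `sketch_rollback()` would leave a CLR_END whose stored LSN tells recovery which UNDO to execute next, so already-undone records are not undone a second time.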
  return 0;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
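The commit message above mentions that, for transactional tables, the record number stored in a key is shifted up by one bit to leave room for a flag saying whether a transaction id follows. A minimal sketch of that kind of packing is shown below; the function names and the choice of the low bit for the flag are assumptions made for illustration, not the actual Maria key format.

```c
#include <stdint.h>

/* Hypothetical packing: record position in the upper bits,
   "transid follows" flag in the lowest bit. */
static uint64_t sketch_pack_recpos(uint64_t recpos, int transid_follows)
{
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recovers the record position by dropping the flag bit. */
static uint64_t sketch_unpack_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Reads back the flag bit. */
static int sketch_transid_follows(uint64_t stored)
{
  return (int) (stored & 1);
}
```

The cost of such a scheme is one bit of addressable range for the record position.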
share= info->s;
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. Is it printed n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
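The commit message above explains that the change to the table's live checksum (the "delta") is stored in the UNDO record, and that the REDO phase applies it only when the table's state is older than the record. The sketch below illustrates that arithmetic and the guard, which mirrors the `cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0` test in the surrounding code. The struct and function names are hypothetical, and LSNs are modelled as plain integers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the part of the table state that matters here. */
typedef struct {
  uint64_t is_of_horizon;  /* records below this LSN are already reflected */
  uint32_t checksum;       /* table's live checksum */
} sketch_state;

/* Delta as it could be stored in 4 bytes; unsigned arithmetic wraps
   modulo 2^32, so a decrease is representable too. */
static uint32_t sketch_checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/* REDO phase: apply the delta only if the state is older than the record.
   This sketch does not advance the horizon itself. */
static void sketch_redo_apply_delta(sketch_state *state, uint64_t rec_lsn,
                                    uint32_t delta)
{
  if (rec_lsn >= state->is_of_horizon)
    state->checksum+= delta;
}
```

With the numbers from the message, a checksum going from 123 to 456 gives a delta of 333, and adding 333 back to 123 during the REDO phase restores 456.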
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
2007-07-26 11:56:21 +02:00
{
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
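The REDO-phase rule described in the commit message above (store the checksum variation in the UNDO record, and apply it only when the table state is older than the record) can be sketched as follows. This is a minimal illustration, not the Maria implementation: `sketch_state`, `sketch_lsn`, `sketch_checksum_delta()` and `sketch_redo_apply_delta()` are hypothetical names, and the 32-bit wraparound arithmetic is an assumption based on the delta being stored in 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the real MARIA_SHARE or LSN types. */
typedef uint64_t sketch_lsn;

struct sketch_state
{
  uint32_t   checksum;   /* live checksum of the table */
  sketch_lsn is_of_lsn;  /* log position that the stored state reflects */
};

/* Delta logged with an UNDO record: new checksum minus old, modulo 2^32. */
static uint32_t sketch_checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/*
  Apply a logged delta during REDO only if the state is older than the
  record. Returns 1 if the delta was applied, 0 if it was skipped.
  The caller is assumed to advance is_of_lsn once the state is saved.
*/
static int sketch_redo_apply_delta(struct sketch_state *state,
                                   sketch_lsn rec_lsn, uint32_t delta)
{
  assert(state != NULL);
  if (state->is_of_lsn >= rec_lsn)
    return 0;                    /* state already contains this change */
  state->checksum+= delta;       /* unsigned addition wraps modulo 2^32 */
  return 1;
}
```

With the numbers from the commit message, a change from 123 to 456 logs a delta of 333; a change that lowers the checksum logs a large unsigned value that wraps back to the right result when added.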
    if (share->calc_checksum)
    {
      uchar buff[HA_CHECKSUM_STORE_SIZE];
      if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
                               PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
                               HA_CHECKSUM_STORE_SIZE, buff, NULL) !=
          HA_CHECKSUM_STORE_SIZE)
      {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (Causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updated some result files to new table checksum results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!

sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_dbug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with less than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
        eprint(tracef, "Failed to read record\n");
        return 1;
      }
      share->state.state.checksum+= ha_checksum_korr(buff);
    }
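The block above reads a fixed-size checksum delta at a known offset inside the log record, fails if fewer bytes than requested come back, and otherwise adds the delta to the live checksum. A self-contained sketch of that pattern follows. All names (`sketch_read_record()`, `sketch_checksum_korr()`, `sketch_add_delta()`) are hypothetical stand-ins for `translog_read_record()` and `ha_checksum_korr()`, and the low-byte-first layout of the 4 stored bytes is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CHECKSUM_STORE_SIZE 4

/* Decode 4 bytes, low byte first (assumed on-disk layout). */
static uint32_t sketch_checksum_korr(const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
         ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/*
  Copy up to 'want' bytes starting at 'offset' of an in-memory record.
  Returns the number of bytes copied, which is smaller than 'want'
  when the record is too short.
*/
static size_t sketch_read_record(const unsigned char *rec, size_t rec_len,
                                 size_t offset, size_t want,
                                 unsigned char *out)
{
  size_t avail= offset < rec_len ? rec_len - offset : 0;
  size_t n= avail < want ? avail : want;
  memcpy(out, rec + (n ? offset : 0), n);
  return n;
}

/* Returns 0 on success, 1 if the delta could not be read in full. */
static int sketch_add_delta(const unsigned char *rec, size_t rec_len,
                            size_t delta_offset, uint32_t *live_checksum)
{
  unsigned char buff[SKETCH_CHECKSUM_STORE_SIZE];
  assert(rec != NULL && live_checksum != NULL);
  if (sketch_read_record(rec, rec_len, delta_offset, sizeof(buff), buff) !=
      sizeof(buff))
    return 1;                               /* short read: leave checksum */
  *live_checksum+= sketch_checksum_korr(buff);
  return 0;
}
```

Comparing the returned byte count against the requested size, as the original code does, catches a truncated record before any partial value reaches the table state.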
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address of the used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  Use MARIA_RECORD_POS for record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open()
  Zerofill new used pages:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
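The commit message above mentions that, for transactional tables, the record number stored in keys is shifted up by 1 bit to make room for a flag saying whether a transid follows. A minimal sketch of such an encoding is given below. The function names are hypothetical, and the choice of the lowest bit for the flag is an assumption; the message only states that the position is shifted by one bit.

```c
#include <assert.h>
#include <stdint.h>

/*
  Pack a record position and a "transid follows" flag into one number.
  The position must fit in 63 bits so that the shift loses nothing.
*/
static uint64_t sketch_recpos_to_key(uint64_t recpos, int transid_follows)
{
  assert((recpos >> 63) == 0);
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the record position from the stored number. */
static uint64_t sketch_key_to_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Tell whether the stored number says that a transid follows. */
static int sketch_key_has_transid(uint64_t stored)
{
  return (int) (stored & 1);
}
```

The cost of this layout is one bit of addressable record positions, in exchange for not needing a separate flag byte per key.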
    share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
  }
  _ma_unpin_all_pages(info, rec->lsn);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  return 0;
}
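The commit note above says that, for transactional tables, the record number stored in keys is shifted up by one bit to leave room for a flag telling whether a transid follows. A minimal sketch of that packing idea is given here; the type and function names (`rec_pos_t`, `pack_key_recpos`, `unpack_key_recno`, `key_has_transid`) are hypothetical and are not the Maria API, and the 64-bit width is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not a Maria typedef. */
typedef uint64_t rec_pos_t;

/*
  Pack a record number and a "transid follows" flag into one value.
  The record number moves up one bit; bit 0 carries the flag.
  Assumption: the top bit of recno is unused, since the shift drops it.
*/
static rec_pos_t pack_key_recpos(rec_pos_t recno, int transid_follows)
{
  assert((recno >> 63) == 0);
  return (recno << 1) | (rec_pos_t) (transid_follows ? 1 : 0);
}

/* Recover the original record number from the stored value. */
static rec_pos_t unpack_key_recno(rec_pos_t stored)
{
  return stored >> 1;
}

/* Return 1 if the stored value says a transid follows, else 0. */
static int key_has_transid(rec_pos_t stored)
{
  return (int) (stored & 1);
}
```

With this layout a reader can test the low bit before deciding whether extra bytes follow the key, at the cost of one bit of record-number range.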
prototype_redo_exec_hook(UNDO_KEY_INSERT)
{
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficent REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo if key pages works storage/maria/ma_test_recovery: Updated test after redo/undo if key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and need to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
  MARIA_HA *info;
  if (!(info= get_MARIA_HA_from_UNDO_record(rec)))
    return 0;
  set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
  _ma_unpin_all_pages(info, rec->lsn);
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  return 0;
}
prototype_redo_exec_hook(UNDO_KEY_DELETE)
{
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficent REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo if key pages works storage/maria/ma_test_recovery: Updated test after redo/undo if key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and need to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
  MARIA_HA *info;
  if (!(info= get_MARIA_HA_from_UNDO_record(rec)))
    return 0;
  set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
  _ma_unpin_all_pages(info, rec->lsn);
2007-11-14 18:08:06 +01:00
  return 0;
}
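prototype_redo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT)
{
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  MARIA_SHARE *share;

  /* No handle for this record's table: skip the record */
  if (info == NULL)
    return 0;
  share= info->s;
  /* Remember this UNDO's LSN for its transaction */
  set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
  /* Touch the index state only if the record is not older than its horizon */
  if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
  {
    uint key_nr;
    my_off_t page;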
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
    /* After the LSN and file id, the header holds the key number, then the page */
    key_nr= key_nr_korr(rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE);
    page= page_korr(rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE +
                    KEY_NR_STORE_SIZE);
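    /*
      IMPOSSIBLE_PAGE_NO means the index has no root page; otherwise
      convert the page number to a byte offset in the index file.
    */
    share->state.key_root[key_nr]= (page == IMPOSSIBLE_PAGE_NO ?
                                    HA_OFFSET_ERROR :
                                    page * share->block_size);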
}
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group).
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
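The commit message above describes why a page's LSN is left unchanged while a group's REDOs are applied, and only stamped with the UNDO's LSN once the group is done. A minimal sketch of that idea follows, under loud assumptions: `page_t`, `table_t`, `apply_redo()` and `end_group()` are hypothetical names invented for illustration, and the real code works through the page cache and `_ma_unpin_all_pages()` rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the real Maria structures. */
typedef uint64_t lsn_t;

enum { N_PAGES = 4, MAX_PINNED = 8 };

typedef struct {
  lsn_t lsn;      /* LSN currently stamped on the page */
  int   applied;  /* number of REDOs applied to the page */
} page_t;

typedef struct {
  page_t pages[N_PAGES];
  size_t pinned[MAX_PINNED];  /* pages touched by the current group */
  size_t n_pinned;
} table_t;

/*
  Apply one REDO unless the page already reflects it.
  The page's LSN is deliberately NOT changed here: stamping it now would
  make a later REDO of the same group, on the same page, look already
  applied. Returns 1 if applied, 0 if skipped.
*/
static int apply_redo(table_t *t, size_t page_no, lsn_t redo_lsn)
{
  page_t *p;
  assert(page_no < N_PAGES);
  p= &t->pages[page_no];
  if (p->lsn >= redo_lsn)
    return 0;                          /* page is recent enough: skip */
  p->applied++;
  assert(t->n_pinned < MAX_PINNED);    /* sketch only: fixed-size pin list */
  t->pinned[t->n_pinned++]= page_no;   /* keep the page "pinned" */
  return 1;
}

/* At the end of the group, stamp every pinned page with the UNDO's LSN. */
static void end_group(table_t *t, lsn_t undo_lsn)
{
  size_t i;
  for (i= 0; i < t->n_pinned; i++)
    t->pages[t->pinned[i]].lsn= undo_lsn;
  t->n_pinned= 0;
}
```

In this model, two REDOs with LSNs 10 and 11 on the same page are both applied, and the page only receives LSN 12 when `end_group(&t, 12)` runs; replaying either REDO afterwards is skipped.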
_ma_unpin_all_pages(info, rec->lsn);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
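The message above notes that the previous UNDO's LSN may legitimately be 0, so 0 can no longer mean "this is not UNDO work", and an impossible LSN is used as the marker instead. A small sketch of that sentinel choice, with assumptions stated plainly: `SKETCH_LSN_ERROR` and `is_undo_work()` are hypothetical names, and the numeric value below is picked for the sketch only, not taken from the Maria headers.

```c
#include <stdint.h>

typedef uint64_t lsn_t;

/* Assumption: an arbitrary "impossible" value, for this sketch only. */
#define SKETCH_LSN_ERROR UINT64_MAX

/*
  Returns 1 when the caller is executing an UNDO.
  undo_lsn is the LSN of the previous UNDO of the transaction; 0 is a
  valid value (the UNDO being executed was the transaction's first),
  so only the sentinel means "not UNDO work".
*/
static int is_undo_work(lsn_t undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}
```

With this convention `is_undo_work(0)` is true, which is the case the old `undo_lsn==0` test got wrong.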
return 0;
}
prototype_redo_exec_hook(COMMIT)
{
uint16 sid= rec->short_trid;
TrID long_trid= all_active_trans[sid].long_trid;
char llbuf[22];
if (long_trid == 0)
{
tprint(tracef, "We don't know about transaction with short_trid %u;"
           "it probably committed long ago, forget it\n", sid);
    return 0;
  }
  llstr(long_trid, llbuf);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
  tprint(tracef, "Transaction long_trid %s short_trid %u committed\n",
         llbuf, sid);
  bzero(&all_active_trans[sid], sizeof(all_active_trans[sid]));
#ifdef MARIA_VERSIONING
  /*
    if real recovery:
    transaction was committed, move it to some separate list for later
    purging (but don't purge now! purging may have been started before, we
    may find REDO_PURGE records soon).
  */
#endif
  return 0;
}
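The LOGREC_INCOMPLETE_GROUP note above (the 2007-12-10 change) describes a boundary record that keeps an orphaned REDO from being paired with a later group end. A minimal sketch of that idea follows, as a toy model only: `rec_kind` and `count_executed_redos()` are hypothetical names invented for illustration, not part of ma_recovery.c, and the model assumes a single transaction whose REDOs are executed only once an UNDO or CLR_END closes their group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds for one transaction; not the real LOGREC_* enum. */
enum rec_kind { K_REDO, K_UNDO, K_CLR_END, K_INCOMPLETE_GROUP };

/*
  Count the REDOs that a REDO phase would execute in this toy model:
  a REDO is executed only when a later UNDO or CLR_END closes its group.
  K_INCOMPLETE_GROUP discards whatever is pending, so an orphaned REDO
  can never be paired with a group end written after it.
*/
static size_t count_executed_redos(const enum rec_kind *log, size_t n)
{
  size_t pending= 0, executed= 0, i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i]) {
    case K_REDO:
      pending++;                /* wait for the record that ends the group */
      break;
    case K_UNDO:
    case K_CLR_END:
      executed+= pending;       /* group is complete: its REDOs are applied */
      pending= 0;
      break;
    case K_INCOMPLETE_GROUP:
      pending= 0;               /* boundary: drop the unfinished group */
      break;
    }
  }
  return executed;              /* REDOs still pending at the end are ignored */
}
```

In this model the log REDO, UNDO, REDO yields 1 executed REDO. After a rollback without the marker (REDO, UNDO, REDO, REDO, CLR_END) it yields 3, which is the bug the note describes. With the marker in place (REDO, UNDO, REDO, INCOMPLETE_GROUP, REDO, CLR_END) it yields 2, so the orphaned REDO stays skipped.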
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
prototype_redo_exec_hook(CLR_END)
{
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  MARIA_SHARE *share;
  LSN previous_undo_lsn;
  enum translog_record_type undone_record_type;
  const LOG_DESC *log_desc;
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
my_bool row_entry= 0;
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now uses fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that change as alter table and insert ... select now uses fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not.
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
DBUG_ENTER("exec_REDO_LOGREC_CLR_END");
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
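The commit note above says that a CLR_END record starts with the LSN of the previous UNDO, carries the table's short id right after it, and stores the type of the record that was undone so that the REDO phase knows how to adjust state.records. A minimal sketch of reading such a header follows. Every `sketch_` name is hypothetical, the 7-byte LSN and 2-byte file id sizes are assumptions, and the plain little-endian layout is a simplification rather than the real lsn_korr() encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field sizes, for illustration only; the real values are
   defined in the Maria headers and may differ. */
enum { SKETCH_LSN_SIZE = 7, SKETCH_FILEID_SIZE = 2 };

/* Hypothetical record types that a CLR_END can refer to. */
typedef enum {
  SKETCH_UNDO_ROW_INSERT,
  SKETCH_UNDO_ROW_DELETE,
  SKETCH_UNDO_ROW_UPDATE
} sketch_undo_type;

typedef struct {
  uint64_t previous_undo_lsn;  /* LSN of the UNDO before the undone one */
  uint16_t file_id;            /* table's short id */
  uint8_t  undone_record_type; /* type of the record that was undone */
} sketch_clr_end;

/* Read n little-endian bytes (n <= 8) into an integer. */
static uint64_t sketch_read_le(const unsigned char *p, size_t n)
{
  uint64_t value = 0;
  size_t i;
  assert(n <= 8);
  for (i = 0; i < n; i++)
    value |= (uint64_t) p[i] << (8 * i);
  return value;
}

/* Decode the assumed header layout: LSN, then file id, then type. */
static sketch_clr_end sketch_decode_clr_end(const unsigned char *header)
{
  sketch_clr_end out;
  out.previous_undo_lsn = sketch_read_le(header, SKETCH_LSN_SIZE);
  out.file_id = (uint16_t) sketch_read_le(header + SKETCH_LSN_SIZE,
                                          SKETCH_FILEID_SIZE);
  out.undone_record_type = header[SKETCH_LSN_SIZE + SKETCH_FILEID_SIZE];
  return out;
}

/* Change to apply to the row count when a CLR_END is seen:
   undoing an insert removes a row, undoing a delete restores one. */
static long sketch_records_delta(sketch_undo_type undone)
{
  switch (undone) {
  case SKETCH_UNDO_ROW_INSERT: return -1;
  case SKETCH_UNDO_ROW_DELETE: return 1;
  default:                     return 0;
  }
}
```

A caller would decode the header once, set the transaction's undo_lsn to `previous_undo_lsn`, and add `sketch_records_delta()` to the row count only when the table's state is older than the record.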
if (info == NULL)
DBUG_RETURN(0);
share= info->s;
previous_undo_lsn= lsn_korr(rec->header);
undone_record_type=
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is therefore unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is therefore unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
clr_type_korr(rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
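The key-page commit message above says that, for transactional tables, the record number stored in keys is shifted up by one bit to leave room for a flag telling whether a transid follows. A minimal sketch of that idea, assuming the flag sits in the lowest bit (the message does not say which bit is used). The `demo_` names are hypothetical and are not functions from the Maria source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of storage/maria. */

/* Shift the record number up by one bit and use the freed low bit
   (an assumption of this sketch) to say whether a transid follows. */
static uint64_t demo_pack_recpos(uint64_t recpos, int transid_follows)
{
  assert(recpos <= (UINT64_MAX >> 1));  /* top bit must be free to shift */
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the original record number from the stored value. */
static uint64_t demo_unpack_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Tell whether the stored value announces a following transid. */
static int demo_transid_follows(uint64_t stored)
{
  return (int) (stored & 1u);
}
```

Packing costs one bit of address range per key, which is why the sketch asserts that the top bit of the record number is unused before shifting.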
log_desc= &log_record_type_descriptor[undone_record_type];
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
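The commit message above states that a CLR_END stores the LSN of the previous UNDO, then the file's id at `log_data+LSN_STORE_SIZE`, and the type of the record being undone. The surrounding code reads the type at `rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE`. A rough sketch of decoding such a header follows. The field sizes (7, 2 and 1 bytes) and the plain little-endian encoding are assumptions made for this illustration, and the `demo_` names are hypothetical; the real definitions are in ma_loghandler.h and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for this sketch only. */
enum
{
  DEMO_LSN_STORE_SIZE= 7,
  DEMO_FILEID_STORE_SIZE= 2,
  DEMO_CLR_TYPE_STORE_SIZE= 1
};

typedef struct
{
  uint64_t previous_undo_lsn;   /* LSN of the UNDO before the undone one */
  unsigned file_id;             /* short id of the table */
  unsigned undone_record_type;  /* which UNDO type this CLR_END is about */
} demo_clr_end;

/* Read n bytes as a little-endian integer (encoding is an assumption). */
static uint64_t demo_read_le(const unsigned char *p, size_t n)
{
  uint64_t v= 0;
  while (n-- > 0)
    v= (v << 8) | p[n];
  return v;
}

/* Decode the three leading fields of a CLR_END header. */
static demo_clr_end demo_parse_clr_end(const unsigned char *header)
{
  demo_clr_end out;
  assert(header != NULL);
  out.previous_undo_lsn= demo_read_le(header, DEMO_LSN_STORE_SIZE);
  out.file_id= (unsigned) demo_read_le(header + DEMO_LSN_STORE_SIZE,
                                       DEMO_FILEID_STORE_SIZE);
  out.undone_record_type= (unsigned) demo_read_le(
      header + DEMO_LSN_STORE_SIZE + DEMO_FILEID_STORE_SIZE,
      DEMO_CLR_TYPE_STORE_SIZE);
  return out;
}
```

Note that a previous-UNDO LSN of 0 is a legitimate value here (the undone record was the transaction's first UNDO), which is why the commit message switches the "not UNDO work" marker to LSN_ERROR.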
set_undo_lsn_for_active_trans(rec->short_trid, previous_undo_lsn);
tprint(tracef, " CLR_END was about %s, undo_lsn now LSN (%lu,0x%lx)\n",
log_desc->name, LSN_IN_PARTS(previous_undo_lsn));
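The live-checksum commit message describes storing the checksum variation in 4 bytes (its example: a change from 123 to 456 is stored as 333) and applying it in the REDO phase only when the table's state is older than the record. The sketch below illustrates that rule under two assumptions: the delta uses unsigned 32-bit wraparound arithmetic, and LSNs are plain integers that compare numerically. All `demo_` names are hypothetical. The comparison mirrors the `cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0` test used in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_lsn;      /* stand-in for a real LSN */

typedef struct
{
  demo_lsn is_of_horizon;       /* log position the stored state reflects */
  uint32_t checksum;            /* table's live checksum */
} demo_state;

/* The 4-byte value an UNDO would carry: new minus old, modulo 2^32. */
static uint32_t demo_checksum_delta(uint32_t old_sum, uint32_t new_sum)
{
  return new_sum - old_sum;
}

/* REDO phase: apply the delta only if the state is older than the record.
   Returns 1 if the checksum was updated, 0 if the record was skipped. */
static int demo_redo_apply_delta(demo_state *state, demo_lsn record_lsn,
                                 uint32_t delta)
{
  assert(state != NULL);
  if (record_lsn < state->is_of_horizon)
    return 0;                   /* state already includes this record */
  state->checksum+= delta;      /* wraps modulo 2^32 by design */
  return 1;
}
```

Because the arithmetic wraps, a delta computed for a decrease (for example from 456 back to 123) still restores the right value when added, so the same 4-byte field works for INSERT, UPDATE and DELETE.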
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.

storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but sounds logical.
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of Danny and Pekka.
storage/maria/maria_read_log.c:
  when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
  This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
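The commit message above says that a CLR_END carries the type of the undone record so that the REDO phase can adjust state.records when the table state is older than the record. A minimal, self-contained sketch of that rule follows. All names here (lsn_t, table_state, undone_type, apply_clr_end_to_records) are hypothetical stand-ins chosen for illustration, not the definitions from the Maria headers, and the "older than" comparison is assumed to be a plain integer comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types live in the Maria headers. */
typedef uint64_t lsn_t;

enum undone_type { UNDONE_ROW_INSERT, UNDONE_ROW_DELETE, UNDONE_ROW_UPDATE };

struct table_state {
  lsn_t    is_of_lsn;  /* LSN up to which this state is known to be current */
  uint64_t records;    /* number of rows in the table */
};

/*
  Apply the row-count effect of a CLR_END seen during the REDO phase.
  Returns 1 if the state was adjusted, 0 if it already reflected the record.
*/
static int apply_clr_end_to_records(struct table_state *state,
                                    lsn_t clr_lsn, enum undone_type undone)
{
  if (state->is_of_lsn >= clr_lsn)
    return 0;                      /* state is not older than the record */
  switch (undone) {
  case UNDONE_ROW_DELETE:          /* undoing a DELETE re-inserts the row */
    state->records++;
    break;
  case UNDONE_ROW_INSERT:          /* undoing an INSERT removes the row */
    assert(state->records > 0);
    state->records--;
    break;
  case UNDONE_ROW_UPDATE:          /* an undone UPDATE keeps the row count */
    break;
  }
  return 1;
}
```

The sketch deliberately does not advance is_of_lsn; how and when the real recovery code advances the state's LSN is not shown in this excerpt.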
{
tprint(tracef, " state older than record\n");
switch (undone_record_type) {
case LOGREC_UNDO_ROW_DELETE:
Fixes for redo/undo logging of key pages
New extendable format for maria_log_control file
Fixed some compiler warnings

include/maria.h:
  Added maria_disable_logging() and maria_enable_logging()
mysql-test/include/maria_verify_recovery.inc:
  Updated tests now when key redo/undo works
mysql-test/r/maria-recovery.result:
  Updated tests now when key redo/undo works
storage/maria/ma_blockrec.c:
  Use unified CLR code
  Added rec_lsn for full pages
  Moved clr write hook to ma_key_recover.c
  Changed REDO code to keep pages pinned until undo
  Mark page_link's as changed
storage/maria/ma_blockrec.h:
  Moved write_hook_for_clr_end() to ma_key_recover.c
storage/maria/ma_check.c:
  Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE
  Fixed wrong warning when checking files after maria_pack
  When unpacking files, we have to use new keypos_to_recpos method
  When doing repair, we can disregard index key file pages in page cache
storage/maria/ma_commit.c:
  Added simple enable/disable logging functions (Needed for recovery)
storage/maria/ma_control_file.c:
  Make maria control file extendable without having to make it incompatible for older versions
storage/maria/ma_control_file.h:
  New error messages
  Added CONTROL_FILE_VERSION
storage/maria/ma_delete.c:
  Added redo/undo for key pages
  change_length -> changed_length to make things similar
  More comments & more DBUG
storage/maria/ma_key_recover.c:
  Unified CLR method
  Moved here write_hook_for_clr_end() and common keypage log functions
  Changed REDO to keep pages pinned until undo
  Changed UNDO code to change key_root under log mutex
storage/maria/ma_key_recover.h:
  New structures and functions
storage/maria/ma_loghandler.c:
  Include needed files
storage/maria/ma_open.c:
  Change maria_open() to use pread() instead of read()
storage/maria/ma_page.c:
  Fixed bug in key_del handling
  Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined
storage/maria/ma_pagecache.c:
  Indentation and spelling fixes
  More DBUG
  Added helper function: pagecache_block_link_to_buffer()
storage/maria/ma_pagecache.h:
  Added pagecache_block_link_to_buffer()
storage/maria/ma_recovery.c:
  Fixed state.changed
  Fixed that REDO keeps pages pinned until UNDO
  Some bug fixes from previous commit
  Fixes for UNDO/REDO of key pages
storage/maria/ma_search.c:
  Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes.
storage/maria/ma_test1.c:
  Fixed bug with not initialized variable
storage/maria/ma_test2.c:
  Removed not used code
storage/maria/ma_test_all.res:
  Updated results
storage/maria/ma_test_all.sh:
  Changed one test to test more
  Removed timing tests as not relevant here
storage/maria/ma_test_recovery.expected:
  Updated test result after redo/undo if key pages works
storage/maria/ma_test_recovery:
  Updated test after redo/undo if key pages works
storage/maria/ma_write.c:
  Moved some general log functions to ma_key_recover.c
  Fixed some bugs in undo
  Moved ma_log_split() to _ma_split_page()
  Small changes in some function arguments to be able to do redo logging
storage/maria/maria_chk.c:
  disable logging while doing repair table
storage/maria/maria_def.h:
  New function prototypes
  Move some structs and functions to ma_key_recover.c
storage/maria/unittest/ma_control_file-t.c:
  Updated with patch from Sanja
  NOTE: This is not complete and needs to be updated to new control file format
storage/maria/unittest/ma_test_loghandler-t.c:
  Fixed compiler warning
2007-11-20 16:42:16 +01:00
row_entry= 1;
WL#3072 - Maria recovery.
* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support.

storage/maria/ha_maria.cc:
  question for Monty
storage/maria/ma_blockrec.c:
  * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c).
  * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation.
  * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123).
  * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr.
  * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
  * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed.
  * _ma_update_block_record2() now expects the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this.
  * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum.
storage/maria/ma_blockrec.h:
  in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
  more straightforward size of buffer
storage/maria/ma_checkpoint.c:
  <= is enough
storage/maria/ma_commit.c:
  new prototype of translog_write_record()
storage/maria/ma_create.c:
  new prototype of translog_write_record()
storage/maria/ma_delete.c:
  The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  @todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
  new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
  * in-write hooks move to ma_blockrec.c.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
  * fix for compiler warning (unused buffer_start when compiling without debug support)
  * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise).
  * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END.
  * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
  * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
storage/maria/ma_open.c:
  fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
  <= is enough
storage/maria/ma_recovery.c:
  * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
  * In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL.
  * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert().
  * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9)
  * removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
  new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
  Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too.
storage/maria/ma_update.c:
  * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway.
  * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function.
storage/maria/ma_write.c:
  If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  new prototype of translog_write_record()
storage/myisam/sort.c:
  fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
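The checksum-delta arithmetic from the commit message above (a change from 123 to 456 is stored as 333 in 4 bytes) can be illustrated with unsigned 32-bit arithmetic, which also keeps "negative" deltas well defined through wraparound. This is only a sketch under stated assumptions: the helper names (checksum_delta, store_delta, load_delta) are hypothetical, and the low-byte-first layout is an assumption made for the example, not something this excerpt confirms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: variation of the live checksum caused by one
   operation. Unsigned subtraction wraps modulo 2^32, so adding the delta
   to the old checksum always yields the new one. */
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/* Store the delta in 4 bytes, low byte first (layout assumed here). */
static void store_delta(unsigned char *dst, uint32_t delta)
{
  assert(dst != 0);
  dst[0] = (unsigned char) (delta & 0xFF);
  dst[1] = (unsigned char) ((delta >> 8) & 0xFF);
  dst[2] = (unsigned char) ((delta >> 16) & 0xFF);
  dst[3] = (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read the 4-byte delta back, as the REDO or UNDO phase would need to. */
static uint32_t load_delta(const unsigned char *src)
{
  assert(src != 0);
  return (uint32_t) src[0] |
         ((uint32_t) src[1] << 8) |
         ((uint32_t) src[2] << 16) |
         ((uint32_t) src[3] << 24);
}
```

With these helpers, checksum_delta(123, 456) is 333, and the delta for the reverse change (456 back to 123) wraps to a large unsigned value that still restores 123 when added to 456 in 32-bit arithmetic.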
share->state.state.records++;
break;
case LOGREC_UNDO_ROW_INSERT:
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. Is it printed n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
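The ma_write.c note above says the multiplication is "a trick to save an if()". A minimal sketch of that idea follows, assuming the live checksum is an unsigned 32-bit sum and that a flag tells whether the table is transactional. The names `sketch_checksum` and `sketch_add_row_checksum` are hypothetical and are not taken from the Maria source.

```c
#include <stdint.h>

/* Hypothetical type for illustration; the real checksum type is defined elsewhere. */
typedef uint32_t sketch_checksum;

/*
  Add the row's checksum to the live checksum only for non-transactional
  tables. For transactional tables the live checksum is assumed to have
  been updated already when UNDO_ROW_INSERT was written.
  !now_transactional evaluates to 1 or 0, so the product is either the
  row checksum or zero and no branch is needed.
*/
static sketch_checksum sketch_add_row_checksum(sketch_checksum live,
                                               sketch_checksum row,
                                               int now_transactional)
{
  return live + (sketch_checksum) (!now_transactional) * row;
}
```

The sum wraps modulo 2^32, which is the usual behaviour for an additive checksum of this kind.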
share->state.state.records--;
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with uninitialized variable storage/maria/ma_test2.c: Removed unused code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
row_entry= 1;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
break;
case LOGREC_UNDO_ROW_UPDATE:
row_entry= 1;
break;
case LOGREC_UNDO_KEY_INSERT:
case LOGREC_UNDO_KEY_DELETE:
break;
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now uses fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that change as alter table and insert ... select now uses fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not.
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
case LOGREC_UNDO_KEY_INSERT_WITH_ROOT:
case LOGREC_UNDO_KEY_DELETE_WITH_ROOT:
{
uint key_nr;
my_off_t page;
uchar buff[KEY_NR_STORE_SIZE + PAGE_STORE_SIZE];
if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
CLR_TYPE_STORE_SIZE,
KEY_NR_STORE_SIZE + PAGE_STORE_SIZE,
buff, NULL) !=
KEY_NR_STORE_SIZE + PAGE_STORE_SIZE)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with fewer than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to read record\n");
DBUG_RETURN(1);
}
key_nr= key_nr_korr(buff);
page= page_korr(buff + KEY_NR_STORE_SIZE);
share->state.key_root[key_nr]= (page == IMPOSSIBLE_PAGE_NO ?
HA_OFFSET_ERROR :
page * share->block_size);
break;
}
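The `LOGREC_UNDO_KEY_INSERT_WITH_ROOT` / `LOGREC_UNDO_KEY_DELETE_WITH_ROOT` branch above reads a key number and a page number from the log record, then turns the page number into a byte offset, mapping the "impossible" page to an error offset. The self-contained sketch below mirrors that conversion. It assumes a 5-byte little-endian page number and an all-ones sentinel; every `sketch_`/`SKETCH_` name is hypothetical, and the real `page_korr()`, `IMPOSSIBLE_PAGE_NO` and `HA_OFFSET_ERROR` come from the Maria and mysys headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; the real constants live in the Maria headers. */
#define SKETCH_PAGE_STORE_SIZE    5                 /* assumed size of a stored page number */
#define SKETCH_IMPOSSIBLE_PAGE_NO 0xFFFFFFFFFFULL   /* assumed "no page" sentinel */
#define SKETCH_OFFSET_ERROR       (~(uint64_t) 0)   /* assumed "no root" file offset */

/* Decode a page number stored as 5 little-endian bytes (assumed layout). */
static uint64_t sketch_page_korr(const unsigned char *p)
{
  uint64_t page= 0;
  int i;
  assert(p != 0);
  for (i= SKETCH_PAGE_STORE_SIZE - 1; i >= 0; i--)
    page= (page << 8) | p[i];
  return page;
}

/*
  Convert a page number into the byte offset of the key root.
  The sentinel page means the index has no root, so it maps to the
  error offset instead of being multiplied by the block size.
*/
static uint64_t sketch_key_root_offset(uint64_t page, uint32_t block_size)
{
  return (page == SKETCH_IMPOSSIBLE_PAGE_NO ?
          SKETCH_OFFSET_ERROR :
          page * block_size);
}
```

Checking for the sentinel before multiplying matters: multiplying the sentinel by the block size would produce a plausible-looking but meaningless offset.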
default:
DBUG_ASSERT(0);
}
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
if (row_entry && share->calc_checksum)
{
uchar buff[HA_CHECKSUM_STORE_SIZE];
if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
CLR_TYPE_STORE_SIZE, HA_CHECKSUM_STORE_SIZE,
buff, NULL) != HA_CHECKSUM_STORE_SIZE)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksum results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log.
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to read record\n");
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now uses fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that changes as alter table and insert ... select now uses fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not.
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
DBUG_RETURN(1);
}
share->state.state.checksum+= ha_checksum_korr(buff);
}
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
}
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo if key pages works storage/maria/ma_test_recovery: Updated test after redo/undo if key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and need to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
if (row_entry)
tprint(tracef, " rows' count %lu\n", (ulong)share->state.state.records);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
_ma_unpin_all_pages(info, rec->lsn);
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now uses fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.excepted is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that changes as alter table and insert ... select now uses fast creation of index mysys/mf_iocache.c: Reset variable to gurard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplicy code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not. 
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
DBUG_RETURN(0);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
}
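The commit message above describes how each UNDO record starts with the LSN of the transaction's previous UNDO, and how a CLR_END written during rollback stores that same previous LSN, with trn->undo_lsn updated only after the record has been executed. The following is a minimal standalone sketch of that chaining, not Maria's implementation: `lsn_t`, `struct log_rec`, `log_recs`, `log_append()` and `rollback()` are hypothetical names invented for illustration, and the log is modelled as a small in-memory array whose index plays the role of the LSN.

```c
#include <assert.h>

typedef unsigned long lsn_t;   /* hypothetical stand-in for an LSN */

enum rec_type { REC_UNDO, REC_CLR_END };

struct log_rec
{
  enum rec_type type;
  lsn_t prev_undo_lsn;  /* LSN of the transaction's previous UNDO; 0 if none */
};

#define LOG_MAX 16
struct log_rec log_recs[LOG_MAX + 1];  /* array index == LSN; slot 0 unused */
lsn_t log_end= 0;                      /* LSN of the last record written */

/* Append a record to the toy log and return its LSN */
lsn_t log_append(enum rec_type type, lsn_t prev_undo_lsn)
{
  assert(log_end < LOG_MAX);
  log_end++;
  log_recs[log_end].type= type;
  log_recs[log_end].prev_undo_lsn= prev_undo_lsn;
  return log_end;
}

/*
  Roll back one transaction by walking its UNDO chain backwards.
  *undo_lsn is the LSN of the transaction's most recent UNDO.
  Returns the number of UNDO records executed.
*/
int rollback(lsn_t *undo_lsn)
{
  int executed= 0;
  while (*undo_lsn != 0)
  {
    lsn_t previous= log_recs[*undo_lsn].prev_undo_lsn;
    /* ... the row change itself would be reverted here ... */
    /* The CLR_END stores the _previous_ UNDO's LSN, not the current one */
    log_append(REC_CLR_END, previous);
    /* Move the transaction's pointer only after the record was executed */
    *undo_lsn= previous;
    executed++;
  }
  return executed;
}
```

In this sketch a previous LSN of 0 simply means "no earlier UNDO", which is why, as the commit message notes, 0 can no longer double as a "this is not UNDO work" marker and a separate impossible value (LSN_ERROR in the real code) is needed for that purpose.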
prototype_undo_exec_hook(UNDO_ROW_INSERT)
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
LSN previous_undo_lsn= lsn_korr(rec->header);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
const uchar *record_ptr;
if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
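The CLR_END convention described in the commit message above (store the LSN of the previous UNDO, and use an impossible LSN rather than 0 to mean "this is not UNDO work") can be illustrated with a small sketch. All names and the sentinel value below are hypothetical, chosen only for illustration; this is not the code from ma_blockrec.c or ma_loghandler.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;

/* 0 is a legitimate "previous UNDO" value: it means the UNDO being
   executed was the transaction's first one. */
#define SKETCH_LSN_NONE  ((sketch_lsn) 0)
/* Hypothetical sentinel standing in for LSN_ERROR (an impossible LSN). */
#define SKETCH_LSN_ERROR ((sketch_lsn) 1)

typedef struct { sketch_lsn lsn; sketch_lsn prev_undo_lsn; } sketch_undo_rec;
typedef struct { sketch_lsn undo_lsn; } sketch_trn;

/* "Is this call part of UNDO work?" cannot be answered by testing for 0. */
static int sketch_is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}

/* Linear lookup of an UNDO record by LSN in a toy in-memory log. */
static const sketch_undo_rec *sketch_find(const sketch_undo_rec *log,
                                          size_t n, sketch_lsn lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
    if (log[i].lsn == lsn)
      return &log[i];
  return NULL;
}

/* Roll back a transaction: execute each UNDO, then (as the CLR_END write
   hook would) move trn->undo_lsn to the previous UNDO's LSN.
   Returns the number of UNDOs executed, or -1 if a record is missing. */
static int sketch_rollback(sketch_trn *trn, const sketch_undo_rec *log,
                           size_t n)
{
  int executed= 0;
  while (trn->undo_lsn != SKETCH_LSN_NONE)
  {
    const sketch_undo_rec *rec= sketch_find(log, n, trn->undo_lsn);
    if (rec == NULL)
      return -1;
    executed++;                        /* the UNDO itself would run here */
    trn->undo_lsn= rec->prev_undo_lsn; /* updated only after executing */
  }
  return executed;
}
```

Because the CLR_END carries the previous UNDO's LSN, a recovery that sees the CLR_END in the REDO phase can resume the rollback from that point without re-executing the already undone record.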
{
  /*
    Unlike for REDOs, a skipped table is abnormal here: we have a
    transaction to roll back which used this table; as the transaction is
    not rolled back yet, it was supposed to hold this table, so the table
    should still be there.
  */
  return 1;
}
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
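The commit message above mentions that, for transactional tables, the record number stored in a key is shifted up by one bit to make room for a "transid follows" indicator. A minimal sketch of that kind of encoding, assuming the freed low bit carries the flag; the helper names are hypothetical and are not the virtual functions added in ma_open.c or ma_search.c:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: pack a record position and a "transid follows" flag into
   the number stored on the key page (flag assumed to be in the low bit). */
static uint64_t sketch_pos_to_key(uint64_t rec_pos, int transid_follows)
{
  return (rec_pos << 1) | (uint64_t) (transid_follows ? 1 : 0);
}

/* Hypothetical: recover the record position from the stored number. */
static uint64_t sketch_key_to_pos(uint64_t stored)
{
  return stored >> 1;
}

/* Hypothetical: test whether a transaction id follows the key. */
static int sketch_key_has_transid(uint64_t stored)
{
  return (int) (stored & 1);
}
```

The cost of this layout is one bit of addressable record positions; in exchange, the flag needs no extra byte on the key page.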
share= info->s;
share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
record_ptr= rec->header;
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c.
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed in seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
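The commit message above mentions two places where a 0/1 flag decides whether something is added: the UNDO record length (X without live checksum, X+4 with it) and the ma_write.c remark that "the multiplication is a trick to save an if()". The following is only a minimal sketch of that idea and not the code of this changeset; every `demo_` and `DEMO_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size, in bytes, of the checksum delta stored in an UNDO. */
#define DEMO_CHECKSUM_DELTA_SIZE 4

/*
  Length of an UNDO record: base_length (the "X" of the commit message)
  without live checksum, base_length + 4 with it. The flag is normalised
  to 0 or 1 so that a multiplication can replace an if().
*/
static size_t demo_undo_record_length(size_t base_length,
                                      int has_live_checksum)
{
  return base_length +
    (size_t) (has_live_checksum != 0) * DEMO_CHECKSUM_DELTA_SIZE;
}

/*
  Insert into a table: for a transactional table the live checksum was
  already updated when UNDO_ROW_INSERT was written, so the row's checksum
  is added here only when the table is not transactional.
*/
static uint32_t demo_checksum_after_insert(uint32_t live_checksum,
                                           int transactional,
                                           uint32_t row_checksum)
{
  return live_checksum + (uint32_t) (transactional == 0) * row_checksum;
}
```

Calling `demo_checksum_after_insert(100, 1, 7)` leaves the checksum at 100, while the non-transactional call `demo_checksum_after_insert(100, 0, 7)` yields 107.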
  if (share->calc_checksum)
  {
    /*
      We need to read more of the record to put the checksum into the record
      buffer used by _ma_apply_undo_row_insert().
      If the table has no live checksum, rec->header will be enough.
    */
    enlarge_buffer(rec);
    if (log_record_buffer.str == NULL ||
        translog_read_record(rec->lsn, 0, rec->record_length,
                             log_record_buffer.str, NULL) !=
        rec->record_length)
    {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with fewer than 8 blocks (causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed TRUNCATE so that it works properly on temporary MyISAM files
Updated some result files to new table checksum results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed unneeded pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Removed unneeded include file
  Change logging to use fd: for file descriptors, as other code does
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with fewer than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved the "page is a node" flag from pagelength to the keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
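The commit message above replaces tests for -1 with tests for a dedicated "file too short" error when a read that must return all requested bytes hits end of file early. The sketch below illustrates that distinction with plain stdio; it is not the mysys my_read()/my_pread() implementation, and the `demo_` function and `DEMO_` codes are hypothetical stand-ins (the numeric value of HA_ERR_FILE_TOO_SHORT is deliberately not assumed).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes, standing in for 0, a generic I/O error and
   HA_ERR_FILE_TOO_SHORT. */
enum demo_read_result
{
  DEMO_OK= 0,
  DEMO_ERR_IO,
  DEMO_ERR_FILE_TOO_SHORT
};

/*
  Read exactly 'count' bytes from 'file' into 'buf'.
  A short read caused by end of file is reported separately from a real
  I/O error, so callers can print a precise message.
*/
static enum demo_read_result demo_read_exact(FILE *file, void *buf,
                                             size_t count)
{
  size_t got= fread(buf, 1, count, file);
  if (got == count)
    return DEMO_OK;
  if (ferror(file))
    return DEMO_ERR_IO;
  return DEMO_ERR_FILE_TOO_SHORT;   /* hit end of file before 'count' bytes */
}
```

A caller can then test for `DEMO_ERR_FILE_TOO_SHORT` explicitly instead of treating every failure as a generic -1.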
      eprint(tracef, "Failed to read record\n");
WL#3072 - Maria recovery.
* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support.
storage/maria/ha_maria.cc:
  question for Monty
storage/maria/ma_blockrec.c:
  * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c).
  * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the record being undone, and also to pass down to all UNDOs the live checksum variation caused by the operation.
  * If the table has a live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123).
  * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr.
  * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
  * In allocate_and_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed.
  * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this.
  * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in the UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the row's checksum.
storage/maria/ma_blockrec.h:
  in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
  more straightforward size of buffer
storage/maria/ma_checkpoint.c:
  <= is enough
storage/maria/ma_commit.c:
  new prototype of translog_write_record()
storage/maria/ma_create.c:
  new prototype of translog_write_record()
storage/maria/ma_delete.c:
  The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, the live checksum was already updated when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  @todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
  new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
  * in-write hooks move to ma_blockrec.c.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because by the time they are called 'parts' may have been mangled (for example by LSN compression) and is therefore unpredictable.
  * fix for compiler warning (unused buffer_start when compiling without debug support)
  * Because the checksum delta is stored in UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has a live checksum, these records are not PSEUDOFIXEDLENGTH anymore; they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise).
  * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in the hooks of UNDO_ROW_INSERT|DELETE, REDO_DELETE_ALL and CLR_END.
  * Bugfix: when reading a record in translog_read_record(), "length" could become negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
  * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because by the time they are called 'parts' may have been mangled (for example by LSN compression) and is therefore unpredictable.
storage/maria/ma_open.c:
  fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
  <= is enough
storage/maria/ma_recovery.c:
  * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as a number of seconds with one fractional digit (like 123.4 seconds).
  * In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL.
  * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have a live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert().
  * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9)
  * removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
  new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
  Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that they do not match merely because all live checksums are now 0: they are the old values, like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too.
storage/maria/ma_update.c:
  * It's useless to set HA_STATE_CHANGED in 'key_changed', as we set HA_STATE_CHANGED in info->update anyway.
  * We need to compute the old and new rows' checksums before calling (*update_record)(), as the checksum delta must be known when logging UNDO_ROW_UPDATE, which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function.
storage/maria/ma_write.c:
  If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  new prototype of translog_write_record()
storage/myisam/sort.c:
  fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
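The checksum delta described in the commit message above can be illustrated with unsigned 32-bit arithmetic. This is a sketch under assumptions, not the recovery code itself: the struct and function names are hypothetical, the delta is assumed to be "new minus old" modulo 2^32 (consistent with the 123 to 456 example giving 333), and the REDO-phase test is assumed to be a strict "state LSN lower than record LSN" comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of the table state used here. */
typedef struct
{
  uint32_t checksum;    /* live checksum of the whole table */
  uint64_t is_of_lsn;   /* LSN up to which the state is known to be valid */
} demo_checksum_state;

/* Delta stored in the UNDO record: new checksum minus old, modulo 2^32. */
static uint32_t demo_checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/*
  REDO phase: apply the logged delta only if the state is older than the
  record. Returns 1 if the delta was applied, 0 if the record was skipped.
*/
static int demo_redo_apply_delta(demo_checksum_state *state,
                                 uint64_t record_lsn, uint32_t delta)
{
  if (state->is_of_lsn >= record_lsn)
    return 0;                       /* state already includes this change */
  state->checksum+= delta;          /* wraps modulo 2^32, like the delta */
  return 1;
}
```

Because both the delta and the addition wrap modulo 2^32, the same mechanism works when an operation lowers the checksum: going from 456 back to 123 stores a large unsigned delta whose addition wraps around to 123.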
      return 1;
    }
    record_ptr= log_record_buffer.str;
  }
  info->trn= trn;
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.
storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row; the net effect was that rollback appeared to have rolled back no deletion). The reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed the checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). The solution is to compute the checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the UNDO being executed was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store the file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under the log's mutex, which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under the log's lock (like UNDOs set trn->undo_lsn under the log's lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but sounds logical.
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they already have their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record); don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug; see the comment for ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of Danny and Pekka.
storage/maria/maria_read_log.c:
  when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
  This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and reports any difference. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
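Two points of the commit message above lend themselves to a small illustration: a previous-UNDO LSN of 0 is now legitimate, so a separate impossible value marks "not UNDO work", and the hook run when CLR_END is written sets trn->undo_lsn and adjusts the record count according to the type of the undone record. The sketch below is hypothetical throughout: the `demo_` types, the sentinel value, and the "undoing a delete increments the count" branch are assumptions (the message only states the records-- for an undone insert).

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_lsn;

/* Hypothetical "impossible" LSN; the real LSN_ERROR value is not assumed. */
#define DEMO_LSN_ERROR UINT64_MAX

typedef enum
{
  DEMO_UNDO_ROW_INSERT,
  DEMO_UNDO_ROW_DELETE,
  DEMO_UNDO_ROW_UPDATE
} demo_undone_type;

typedef struct { demo_lsn undo_lsn; } demo_trn;
typedef struct { long records; } demo_records_state;

/* 0 is a valid previous-UNDO LSN (first UNDO of the transaction), so only
   the sentinel means "this call is not UNDO work". */
static int demo_is_undo_work(demo_lsn undo_lsn)
{
  return undo_lsn != DEMO_LSN_ERROR;
}

/* Sketch of a hook run while writing CLR_END (in the real code, under the
   log's mutex): remember the previous UNDO and fix the record count. */
static void demo_clr_end_hook(demo_trn *trn, demo_records_state *state,
                              demo_lsn previous_undo_lsn,
                              demo_undone_type undone)
{
  trn->undo_lsn= previous_undo_lsn;
  switch (undone)
  {
  case DEMO_UNDO_ROW_INSERT:
    state->records--;               /* the inserted row is removed */
    break;
  case DEMO_UNDO_ROW_DELETE:
    state->records++;               /* assumption: the row is re-inserted */
    break;
  case DEMO_UNDO_ROW_UPDATE:
    break;                          /* row count unchanged */
  }
}
```

Storing the undone record's type in the CLR_END is what lets the REDO phase choose the right branch without re-reading the original UNDO.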
  error= _ma_apply_undo_row_insert(info, previous_undo_lsn,
                                   record_ptr + LSN_STORE_SIZE +
                                   FILEID_STORE_SIZE);
  info->trn= 0;
2007-09-06 16:04:36 +02:00
/* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
tprint(tracef, " rows' count %lu\n", (ulong)info->s->state.state.records);
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
LSN_IN_PARTS(previous_undo_lsn));
return error;
}
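The commit notes above say that each CLR_END stores the LSN of the _previous_ UNDO, and that trn->undo_lsn and the row count are updated when the CLR_END is written. A minimal sketch of that chain walk follows. Every name prefixed with demo_ is hypothetical and only stands in for the real Maria types and hooks; the in-memory array stands in for the log. An LSN of 0 is used as "no earlier UNDO", as the notes describe for a transaction's first UNDO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the real Maria types. */
typedef uint64_t demo_lsn;
#define DEMO_LSN_NONE ((demo_lsn) 0)   /* first UNDO has no predecessor */

enum demo_rec_type { DEMO_UNDO_ROW_INSERT, DEMO_UNDO_ROW_DELETE };

struct demo_undo
{
  demo_lsn lsn;                 /* where this UNDO lives in the "log" */
  demo_lsn previous_undo_lsn;   /* what a CLR_END for it would store */
  enum demo_rec_type type;
};

struct demo_trn
{
  demo_lsn undo_lsn;            /* head of the transaction's undo chain */
  long records;                 /* stand-in for state.records */
};

/* Linear lookup of a record by LSN in an in-memory array. */
static const struct demo_undo *demo_find(const struct demo_undo *log,
                                         size_t n, demo_lsn lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
    if (log[i].lsn == lsn)
      return &log[i];
  return NULL;
}

/*
  Execute one UNDO: adjust the row count according to the undone record's
  type, then rewind the chain to the previous UNDO. Returns 0 on success.
*/
static int demo_execute_undo(struct demo_trn *trn, const struct demo_undo *rec)
{
  if (rec->previous_undo_lsn >= rec->lsn)
    return 1;                   /* chain must go strictly backwards */
  switch (rec->type) {
  case DEMO_UNDO_ROW_INSERT: trn->records--; break;  /* undo an insert */
  case DEMO_UNDO_ROW_DELETE: trn->records++; break;  /* undo a delete */
  default: return 1;
  }
  trn->undo_lsn= rec->previous_undo_lsn;
  return 0;
}

/* Roll back a transaction; returns the number of UNDOs run, or -1. */
static int demo_rollback(struct demo_trn *trn,
                         const struct demo_undo *log, size_t n)
{
  int executed= 0;
  while (trn->undo_lsn != DEMO_LSN_NONE)
  {
    const struct demo_undo *rec= demo_find(log, n, trn->undo_lsn);
    if (rec == NULL || demo_execute_undo(trn, rec))
      return -1;
    executed++;
  }
  return executed;
}
```

Because the chain head is only rewound after the record has been handled, a failure leaves trn->undo_lsn pointing at the UNDO that still needs to be executed.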
prototype_undo_exec_hook(UNDO_ROW_DELETE)
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
LSN previous_undo_lsn= lsn_korr(rec->header);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (now also needed for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address of the MARIA_PINNED_PAGE to use - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key Use MARIA_RECORD_POS for record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
if (info == NULL)
return 1;
share= info->s;
share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_dbug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log.
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    eprint(tracef, "Failed to read record\n");
    return 1;
  }
  info->trn= trn;
  /*
    For now we skip the page and directory entry. This is to be used
    later when we mark rows as deleted.
  */
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.
storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but sounds logical.
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of Danny and Pekka.
storage/maria/maria_read_log.c:
  when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
  This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
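The commit note above explains why 0 can no longer mean "this is not UNDO work": the previous UNDO's LSN is legitimately 0 when the UNDO being executed was the transaction's first. A minimal sketch of that sentinel choice follows. The names sketch_lsn, SKETCH_LSN_ERROR and describe_undo_lsn() are hypothetical and exist only for this illustration, and the sentinel's numeric value is an assumption rather than the Maria definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t sketch_lsn;              /* hypothetical stand-in for LSN */

/* Assumed sentinel: a value that no real UNDO record can have as its LSN. */
#define SKETCH_LSN_ERROR ((sketch_lsn) 1)

/*
  Classify the undo_lsn argument handed down to a row-format function.
  0 is a valid "previous UNDO" (the transaction has no earlier UNDO),
  so only the sentinel means "this call is not UNDO work".
*/
static const char *describe_undo_lsn(sketch_lsn undo_lsn)
{
  if (undo_lsn == SKETCH_LSN_ERROR)
    return "normal write";
  if (undo_lsn == 0)
    return "undo, first UNDO of transaction";
  return "undo, earlier UNDO exists";
}
```

With this convention a caller doing ordinary (non-UNDO) work would pass the sentinel, while the UNDO phase passes whatever previous-UNDO LSN it read from the record, including 0.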
  error= _ma_apply_undo_row_delete(info, previous_undo_lsn,
                                   log_record_buffer.str + LSN_STORE_SIZE +
                                   FILEID_STORE_SIZE + PAGE_STORE_SIZE +
                                   DIRPOS_STORE_SIZE,
                                   rec->record_length -
                                   (LSN_STORE_SIZE + FILEID_STORE_SIZE +
                                    PAGE_STORE_SIZE + DIRPOS_STORE_SIZE));
  info->trn= 0;
  tprint(tracef, " rows' count %lu\n undo_lsn now LSN (%lu,0x%lx)\n",
WL#3072 - Maria recovery.
* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support.
storage/maria/ha_maria.cc:
  question for Monty
storage/maria/ma_blockrec.c:
  * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c).
  * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation.
  * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123).
  * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr.
  * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
  * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed.
  * _ma_update_block_record2() now expects the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this.
  * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record.
storage/maria/ma_blockrec.h:
  in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
  more straightforward size of buffer
storage/maria/ma_checkpoint.c:
  <= is enough
storage/maria/ma_commit.c:
  new prototype of translog_write_record()
storage/maria/ma_create.c:
  new prototype of translog_write_record()
storage/maria/ma_delete.c:
  The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  @todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
  new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
  * in-write hooks move to ma_blockrec.c.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
  * fix for compiler warning (unused buffer_start when compiling without debug support)
  * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise).
  * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END.
  * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
  * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
storage/maria/ma_open.c:
  fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
  <= is enough
storage/maria/ma_recovery.c:
  * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed in seconds with a fractional part of one digit (like 123.4 seconds).
  * In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL.
  * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert().
  * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9)
  * removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
  new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
  Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too.
storage/maria/ma_update.c:
  * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway.
  * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function.
storage/maria/ma_write.c:
  If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  new prototype of translog_write_record()
storage/myisam/sort.c:
  fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
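The commit note above describes storing a 4-byte checksum delta in the UNDO records (its own example: a change from 123 to 456 stores 333) and applying it in the REDO phase only when the table state is older than the record. A minimal sketch of that arithmetic follows, assuming a 32-bit unsigned checksum with wraparound and plain integers in place of LSNs. The names sketch_checksum, checksum_delta() and redo_apply_delta() are hypothetical and are not Maria functions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t sketch_checksum;   /* hypothetical 4-byte checksum type */

/* Delta logged with the UNDO record: new minus old, modulo 2^32. */
static sketch_checksum checksum_delta(sketch_checksum before,
                                      sketch_checksum after)
{
  return (sketch_checksum) (after - before);
}

/*
  REDO-phase sketch: apply the logged delta only if the table state is
  older than the log record; otherwise the state already includes it.
  LSNs are modelled as plain 64-bit integers here.
*/
static sketch_checksum redo_apply_delta(sketch_checksum live,
                                        sketch_checksum delta,
                                        uint64_t state_lsn,
                                        uint64_t record_lsn)
{
  if (state_lsn < record_lsn)
    live= (sketch_checksum) (live + delta);
  return live;
}
```

Because the arithmetic is modular, a delta that would be negative as a signed number still round-trips: adding it back to the old checksum yields the new one.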
         (ulong)share->state.state.records, LSN_IN_PARTS(previous_undo_lsn));
  return error;
}
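The call to _ma_apply_undo_row_delete() in the function that ends here skips a fixed header (LSN, file id, page, directory position) and passes the remainder of the record as the row payload. The pointer and length arithmetic can be sketched as below. The SKETCH_* sizes are assumed values for illustration only (the real constants are defined in the Maria headers), and sketch_slice and undo_row_delete_payload() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed store sizes; illustration only, not taken from the Maria headers. */
enum
{
  SKETCH_LSN_STORE_SIZE= 7,
  SKETCH_FILEID_STORE_SIZE= 2,
  SKETCH_PAGE_STORE_SIZE= 5,
  SKETCH_DIRPOS_STORE_SIZE= 1,
  SKETCH_UNDO_DELETE_HEADER= SKETCH_LSN_STORE_SIZE + SKETCH_FILEID_STORE_SIZE +
                             SKETCH_PAGE_STORE_SIZE + SKETCH_DIRPOS_STORE_SIZE
};

typedef struct
{
  const unsigned char *data;   /* first byte after the fixed header */
  size_t length;               /* bytes remaining after the header */
} sketch_slice;

/* Split a record buffer into "fixed header" and "payload". */
static sketch_slice undo_row_delete_payload(const unsigned char *record,
                                            size_t record_length)
{
  sketch_slice payload;
  /* A record shorter than its header would make the length underflow. */
  assert(record_length >= SKETCH_UNDO_DELETE_HEADER);
  payload.data= record + SKETCH_UNDO_DELETE_HEADER;
  payload.length= record_length - SKETCH_UNDO_DELETE_HEADER;
  return payload;
}
```

The same sum appears twice in the original call, once as the pointer offset and once subtracted from rec->record_length; naming it once, as the sketch does, keeps the two in step.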
prototype_undo_exec_hook(UNDO_ROW_UPDATE)
{
  my_bool error;
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  LSN previous_undo_lsn= lsn_korr(rec->header);
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable
include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  MARIA_SHARE *share;
  if (info == NULL)
    return 1;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
share= info->s;
share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to read record\n");
return 1;
}
info->trn= trn;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
error= _ma_apply_undo_row_update(info, previous_undo_lsn,
log_record_buffer.str + LSN_STORE_SIZE +
Added applying of undo for updates Fixed bug in duplicate key handling for block records during repair All read-row methods now return error number in case of error Don't calculate checksum for null fields Fixed bug when running maria_read_log with -o BUILD/SETUP.sh: Added STACK_DIRECTION BUILD/compile-pentium-debug-max: Moved STACK_DIRECTION to SETUP include/myisam.h: Added extra parameter to write_key storage/maria/ma_blockrec.c: Added applying of undo for updates Fixed indentation Removed some not needed casts Fixed wrong logging of CLR record Split ma_update_block_record to two functions to be able to reuse it from undo-applying Simplify filling of packed fields ma_record_block_record() now returns error number on failure Slightly changed log record information for undo-update storage/maria/ma_check.c: Fixed bug in duplicate key handling for block records during repair storage/maria/ma_checksum.c: Don't calculate checksum for null fields storage/maria/ma_dynrec.c: _ma_read_dynamic_record() now returns error number on error Rest of the changes are code simplification and indentation fixes storage/maria/ma_locking.c: Added comment storage/maria/ma_loghandler.c: More debugging Removed printing of total_record_length as this was always same as record_length storage/maria/ma_open.c: Allocate bitmap for changed fields storage/maria/ma_packrec.c: read_record now returns error number on error storage/maria/ma_recovery.c: Fixed wrong arguments to undo_row_update storage/maria/ma_statrec.c: read_record now returns error number on error (not 1) Code simplification storage/maria/ma_test1.c: Added exit possibility after update phase (to test undo of updates) storage/maria/maria_def.h: Include bitmap header file storage/maria/maria_read_log.c: Fixed bug when running with -o
2007-09-09 18:15:10 +02:00
FILEID_STORE_SIZE,
rec->record_length -
(LSN_STORE_SIZE + FILEID_STORE_SIZE));
info->trn= 0;
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
LSN_IN_PARTS(previous_undo_lsn));
return error;
}
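The call above passes the UNDO record body to the apply function by skipping a fixed-size prefix (the previous UNDO's LSN followed by the table's short id) and shrinking the length by the same amount. A minimal, self-contained sketch of that offset arithmetic follows. The names `SKETCH_LSN_STORE_SIZE`, `SKETCH_FILEID_STORE_SIZE`, `undo_payload` and `split_undo_record()` are hypothetical and exist only for this illustration, and the sizes 7 and 2 are assumed values, not taken from this file. The length check is an addition of the sketch; the code above does not perform it.

```c
#include <assert.h>   /* only needed by callers that assert on the result */
#include <stddef.h>

/* Assumed prefix sizes, for illustration only. */
#define SKETCH_LSN_STORE_SIZE    7   /* stored LSN of the previous UNDO */
#define SKETCH_FILEID_STORE_SIZE 2   /* stored short id of the table */

/* Hypothetical view on the part of an UNDO record that follows the prefix. */
typedef struct
{
  const unsigned char *data;   /* first byte after the prefix */
  size_t length;               /* number of bytes after the prefix */
} undo_payload;

/*
  Split a raw UNDO record into prefix and payload.
  Returns 0 on success, 1 if the arguments are invalid or the record is
  too short to contain the prefix.
*/
static int split_undo_record(const unsigned char *record,
                             size_t record_length,
                             undo_payload *out)
{
  const size_t prefix= SKETCH_LSN_STORE_SIZE + SKETCH_FILEID_STORE_SIZE;
  if (record == NULL || out == NULL || record_length < prefix)
    return 1;
  out->data= record + prefix;            /* same pointer offset as the call above */
  out->length= record_length - prefix;   /* same length adjustment as the call above */
  return 0;
}
```

With these assumed sizes, a 12-byte record yields a 3-byte payload starting at offset 9, and a record shorter than 9 bytes is rejected.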
prototype_undo_exec_hook(UNDO_KEY_INSERT)
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
LSN previous_undo_lsn= lsn_korr(rec->header);
MARIA_SHARE *share;
if (info == NULL)
{
/*
Unlike for REDOs, a skipped table is abnormal here: we have a
transaction to roll back which used this table. As the transaction is
not yet rolled back, it was supposed to still hold the table, so the
table should still be there.
*/
return 1;
}
share= info->s;
share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
    eprint(tracef, "Failed to read record\n");
    return 1;
  }
  info->trn= trn;
  error= _ma_apply_undo_key_insert(info, previous_undo_lsn,
                                   log_record_buffer.str + LSN_STORE_SIZE +
                                   FILEID_STORE_SIZE,
                                   rec->record_length - LSN_STORE_SIZE -
                                   FILEID_STORE_SIZE);
  info->trn= 0;
  /* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
  tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
         LSN_IN_PARTS(previous_undo_lsn));
  return error;
}

prototype_undo_exec_hook(UNDO_KEY_DELETE)
{
  my_bool error;
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  LSN previous_undo_lsn= lsn_korr(rec->header);
  MARIA_SHARE *share;
  if (info == NULL)
  {
    /*
      Unlike for REDOs, if the table was skipped it is abnormal; we have a
      transaction to roll back which used this table. As it is not rolled
      back yet, it was supposed to hold this table, and so the table should
      still be there.
    */
    return 1;
  }
  share= info->s;
  share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record\n");
    return 1;
  }
  info->trn= trn;
  error= _ma_apply_undo_key_delete(info, previous_undo_lsn,
                                   log_record_buffer.str + LSN_STORE_SIZE +
                                   FILEID_STORE_SIZE,
                                   rec->record_length - LSN_STORE_SIZE -
                                   FILEID_STORE_SIZE);
  info->trn= 0;
  /* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
  tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
         LSN_IN_PARTS(previous_undo_lsn));
  return error;
}

prototype_undo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT)
{
  my_bool error;
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  LSN previous_undo_lsn= lsn_korr(rec->header);
  MARIA_SHARE *share;
  if (info == NULL)
  {
    /*
      Unlike for REDOs, a skipped table is abnormal here: the transaction
      we have to roll back used this table, and since it has not been
      rolled back yet it is supposed to still hold the table, so the table
      should still be there.
    */
    return 1;
  }
  share= info->s;
  share->state.changed|= STATE_CHANGED | STATE_NOT_ANALYZED;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    eprint(tracef, "Failed to read record\n");
    return 1;
  }
  info->trn= trn;
  error= _ma_apply_undo_key_delete(info, previous_undo_lsn,
                                   log_record_buffer.str + LSN_STORE_SIZE +
                                   FILEID_STORE_SIZE + PAGE_STORE_SIZE,
                                   rec->record_length - LSN_STORE_SIZE -
                                   FILEID_STORE_SIZE - PAGE_STORE_SIZE);
  info->trn= 0;
  /* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
  tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
         LSN_IN_PARTS(previous_undo_lsn));
  return error;
}
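
The two UNDO_KEY_DELETE hooks above differ only in how much fixed-size header they skip before handing the rest of the record to `_ma_apply_undo_key_delete()`. The sketch below restates that offset arithmetic in isolation. It is an illustration under stated assumptions, not code from this file: every `demo_` / `DEMO_` name is hypothetical, the sizes 7, 2 and 5 are illustrative stand-ins for `LSN_STORE_SIZE`, `FILEID_STORE_SIZE` and `PAGE_STORE_SIZE`, and the reading of the extra field as a root page reference is inferred from the record name only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; the real constants are defined in Maria's headers. */
enum { DEMO_LSN_SIZE= 7, DEMO_FILEID_SIZE= 2, DEMO_PAGE_SIZE= 5 };

/* Hypothetical view of the bytes that follow the fixed record header. */
typedef struct demo_payload
{
  const unsigned char *data;
  size_t length;
} DEMO_PAYLOAD;

/*
  Skip the fixed header (previous undo LSN, table short id and, for the
  WITH_ROOT variant, one extra page-sized field) and return what is left.
*/
static DEMO_PAYLOAD demo_undo_key_payload(const unsigned char *record,
                                          size_t record_length,
                                          int with_root)
{
  size_t header= DEMO_LSN_SIZE + DEMO_FILEID_SIZE +
                 (with_root ? DEMO_PAGE_SIZE : 0);
  DEMO_PAYLOAD payload;
  /* A record shorter than its own header would be corrupt. */
  assert(record_length >= header);
  payload.data= record + header;
  payload.length= record_length - header;
  return payload;
}
```

Keeping the pointer offset and the length reduction in one helper makes it harder for the two to drift apart, which is the property the paired `+`/`-` expressions in the hooks rely on.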
static int run_redo_phase(LSN lsn, enum maria_apply_log_way apply)
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
{
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Bascily avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Dirrect link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner() 
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
  TRANSLOG_HEADER_BUFFER rec;
  struct st_translog_scanner_data scanner;
  int len;
  uint i;
  /* install hooks for execution */
#define install_redo_exec_hook(R)                                        \
log_record_type_descriptor[LOGREC_ ## R].record_execute_in_redo_phase= \
exec_REDO_LOGREC_ ## R;
#define install_undo_exec_hook(R) \
log_record_type_descriptor[LOGREC_ ## R].record_execute_in_undo_phase= \
exec_UNDO_LOGREC_ ## R;
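The two macros above rely on preprocessor token pasting: the argument R is glued onto both the LOGREC_ enum prefix and the exec_REDO_LOGREC_ / exec_UNDO_LOGREC_ function prefix, so one short name selects the table slot and the handler together. Below is a minimal standalone sketch of that pattern. Every demo_ / DEMO_ name is hypothetical and exists only for illustration; the real descriptor table and record types live in ma_loghandler.h and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record types, standing in for the real LOGREC_* enum. */
enum demo_logrec_type
{
  DEMO_LOGREC_COMMIT,
  DEMO_LOGREC_CHECKPOINT,
  DEMO_LOGREC_COUNT
};

typedef int (*demo_exec_hook)(int payload);

/* One descriptor per record type; static storage starts zeroed (no hook). */
struct demo_descriptor
{
  demo_exec_hook execute_in_redo_phase;
};

static struct demo_descriptor demo_descriptor_table[DEMO_LOGREC_COUNT];

/* Hypothetical handlers: the names must match the pasted macro argument. */
static int demo_exec_REDO_COMMIT(int payload)     { return payload + 1; }
static int demo_exec_REDO_CHECKPOINT(int payload) { return payload * 2; }

/* R is pasted into both the enum constant and the handler name. */
#define demo_install_redo_hook(R) \
  demo_descriptor_table[DEMO_LOGREC_ ## R].execute_in_redo_phase= \
    demo_exec_REDO_ ## R

static void demo_install_all(void)
{
  demo_install_redo_hook(COMMIT);
  demo_install_redo_hook(CHECKPOINT);
}

/* Look up the hook for a record type; -1 means "no hook installed". */
static int demo_dispatch(enum demo_logrec_type type, int payload)
{
  demo_exec_hook hook= demo_descriptor_table[type].execute_in_redo_phase;
  return hook ? hook(payload) : -1;
}
```

A mistyped record name fails at compile time, because the pasted enum constant or handler name does not exist; that is the main benefit of the pattern over filling the table by hand.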
install_redo_exec_hook(LONG_TRANSACTION_ID);
install_redo_exec_hook(CHECKPOINT);
install_redo_exec_hook(REDO_CREATE_TABLE);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
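The commit message above describes two LSN comparisons that decide whether an id->name mapping may be used for a REDO record: the table is skipped if the mapping is older than create_rename_lsn, and a record is skipped if the mapping is newer than the record. The following is a minimal sketch of those two checks only. The demo_ types and function are hypothetical, LSNs are modelled as plain integers, and the treatment of exact equality is an assumption; the real logic is in new_table() and get_MARIA_HA_from_REDO_record().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: an LSN is modelled as an integer that grows with log order. */
typedef uint64_t demo_lsn;

/* Hypothetical subset of the per-table state that the checks need. */
struct demo_table
{
  demo_lsn create_rename_lsn; /* REDOs before this LSN are ignored */
  demo_lsn lsn_of_file_id;    /* LSN of the last id->name mapping record */
};

/*
  Returns 1 if a REDO record at rec_lsn may be applied through this mapping,
  0 if it has to be skipped.
*/
static int demo_should_apply_redo(const struct demo_table *table,
                                  demo_lsn rec_lsn)
{
  /* Mapping older than the create/rename: it names a previous table. */
  if (table->lsn_of_file_id < table->create_rename_lsn)
    return 0;
  /* Mapping newer than the record: new mappings do not apply to old REDOs. */
  if (table->lsn_of_file_id > rec_lsn)
    return 0;
  /* Both checks passed, which also implies rec_lsn >= create_rename_lsn. */
  return 1;
}
```

The second case can arise when recovery processes a record located before the checkpoint record, since the checkpoint may carry a mapping that was logged later than that record.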
install_redo_exec_hook(REDO_RENAME_TABLE);
install_redo_exec_hook(REDO_REPAIR_TABLE);
install_redo_exec_hook(REDO_DROP_TABLE);
install_redo_exec_hook(FILE_ID);
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is thus incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
install_redo_exec_hook(INCOMPLETE_LOG);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
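The bug fixed by LOGREC_INCOMPLETE_GROUP can be illustrated with a toy model of one transaction's log. In this sketch, REDOs are buffered until an UNDO or CLR_END closes their group, and a boundary record discards whatever is buffered. The demo_ names are hypothetical and the model is deliberately simplified (a single transaction, no LSNs); it is not the code of ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds for a single transaction's log. */
enum demo_rec
{
  DEMO_REDO,
  DEMO_UNDO,
  DEMO_CLR_END,
  DEMO_INCOMPLETE_GROUP
};

/*
  Count how many REDOs a recovery pass would execute. REDOs stay pending
  until an UNDO or CLR_END completes their group. DEMO_INCOMPLETE_GROUP
  draws a boundary: pending REDOs before it are dropped for good.
*/
static int demo_count_applied_redos(const enum demo_rec *log, size_t n)
{
  int applied= 0, pending= 0;
  for (size_t i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case DEMO_REDO:
      pending++;
      break;
    case DEMO_UNDO:
    case DEMO_CLR_END:
      applied+= pending;          /* the group is complete, execute it */
      pending= 0;
      break;
    case DEMO_INCOMPLETE_GROUP:
      pending= 0;                 /* boundary: skip the incomplete group */
      break;
    }
  }
  return applied;                 /* REDOs still pending at the end are ignored */
}
```

With the log REDO, UNDO, REDO* the model applies one REDO. After a rollback appends REDO, CLR_END without a boundary, a second pass would apply three, wrongly pairing REDO* with the CLR_END; with the boundary written right after REDO*, it applies two.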
install_redo_exec_hook(INCOMPLETE_GROUP);
install_redo_exec_hook(REDO_INSERT_ROW_HEAD);
install_redo_exec_hook(REDO_INSERT_ROW_TAIL);
Merge some changes from sql directory in 5.1 tree Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce identical .MAD files as normal usage REDO_FREE_ROW_BLOCKS doesn't anymore change pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs
2007-10-19 23:24:22 +02:00
install_redo_exec_hook(REDO_INSERT_ROW_BLOBS);
install_redo_exec_hook(REDO_PURGE_ROW_HEAD);
install_redo_exec_hook(REDO_PURGE_ROW_TAIL);
install_redo_exec_hook(REDO_FREE_HEAD_OR_TAIL);
install_redo_exec_hook(REDO_FREE_BLOCKS);
install_redo_exec_hook(REDO_DELETE_ALL);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
install_redo_exec_hook(REDO_INDEX);
install_redo_exec_hook(REDO_INDEX_NEW_PAGE);
install_redo_exec_hook(REDO_INDEX_FREE_PAGE);
install_redo_exec_hook(UNDO_ROW_INSERT);
install_redo_exec_hook(UNDO_ROW_DELETE);
install_redo_exec_hook(UNDO_ROW_UPDATE);
install_redo_exec_hook(UNDO_KEY_INSERT);
install_redo_exec_hook(UNDO_KEY_DELETE);
install_redo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
install_redo_exec_hook(COMMIT);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
install_redo_exec_hook(CLR_END);
install_undo_exec_hook(UNDO_ROW_INSERT);
install_undo_exec_hook(UNDO_ROW_DELETE);
install_undo_exec_hook(UNDO_ROW_UPDATE);
2007-11-14 18:08:06 +01:00
install_undo_exec_hook(UNDO_KEY_INSERT);
install_undo_exec_hook(UNDO_KEY_DELETE);
install_undo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but misses a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
current_group_end_lsn= LSN_IMPOSSIBLE;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDOs about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
#ifndef DBUG_OFF
current_group_table= NULL;
#endif
2007-08-29 16:43:01 +02:00
if (unlikely(lsn == LSN_IMPOSSIBLE || lsn == translog_get_horizon()))
{
tprint(tracef, "checkpoint address refers to the log end or "
"log is empty, nothing to do.\n");
return 0;
}
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner() 
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
len= translog_read_record_header(lsn, &rec);
2007-08-29 16:43:01 +02:00
if (len == RECHEADER_READ_ERROR)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to read header of the first record.\n");
return 1;
}
if (translog_scanner_init(lsn, 1, &scanner, 1))
{
tprint(tracef, "Scanner init failed\n");
return 1;
}
for (i= 1;;i++)
{
uint16 sid= rec.short_trid;
const LOG_DESC *log_desc= &log_record_type_descriptor[rec.type];
display_record_position(log_desc, &rec, i);
/*
A complete group is a set of log records with an "end mark" record
(e.g. a set of REDOs for an operation, terminated by an UNDO for this
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as LOGREC_INCOMPLETE_GROUP draws a boundary).
2007-12-10 23:26:53 +01:00
operation); if there is no "end mark" record the group is incomplete and
won't be executed.
*/
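The grouping rule described in the comment above can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not the code in ma_recovery.c: every `toy_` name is hypothetical, array indexes stand in for LSNs, and "executing" a record is reduced to counting it. A record that is in a group is held back until its end mark arrives; a record that is a group by itself discards any group still open for the same transaction; groups still open when the log ends are never executed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the record_in_group values of a log descriptor. */
enum toy_group_kind { TOY_IN_GROUP, TOY_LAST_IN_GROUP, TOY_IS_GROUP_ITSELF };

struct toy_record
{
  unsigned sid;              /* short transaction id (toy version) */
  enum toy_group_kind kind;
};

struct toy_result
{
  size_t executed;   /* records that belonged to a complete group */
  size_t discarded;  /* records dropped because their group never ended */
};

#define TOY_MAX_TRANS 16

/* Scan the toy log once and count executed versus discarded records. */
struct toy_result toy_scan(const struct toy_record *log, size_t n)
{
  size_t pending[TOY_MAX_TRANS]; /* held-back records of the open group */
  struct toy_result res= {0, 0};
  size_t i;

  assert(n == 0 || log != NULL);
  for (i= 0; i < TOY_MAX_TRANS; i++)
    pending[i]= 0;

  for (i= 0; i < n; i++)
  {
    unsigned sid= log[i].sid % TOY_MAX_TRANS;
    switch (log[i].kind)
    {
    case TOY_IN_GROUP:
      /* Opens or extends a group; nothing is executed yet. */
      pending[sid]++;
      break;
    case TOY_LAST_IN_GROUP:
      /* End mark: the held-back records and the end mark are executed. */
      res.executed+= pending[sid] + 1;
      pending[sid]= 0;
      break;
    case TOY_IS_GROUP_ITSELF:
      /* An open group without end mark is incomplete: discard it. */
      res.discarded+= pending[sid];
      pending[sid]= 0;
      res.executed++;
      break;
    }
  }
  /* Groups still open at the end of the log are incomplete as well. */
  for (i= 0; i < TOY_MAX_TRANS; i++)
    res.discarded+= pending[i];
  return res;
}
```

For example, feeding this model two REDO-like records and an end mark for transaction 1, one REDO-like record followed by a standalone record for transaction 2, and a lone REDO-like record for transaction 3 counts four executed records and two discarded ones.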
if ((log_desc->record_in_group == LOGREC_IS_GROUP_ITSELF) ||
(log_desc->record_in_group == LOGREC_LAST_IN_GROUP))
{
if (all_active_trans[sid].group_start_lsn != LSN_IMPOSSIBLE)
{
if (log_desc->record_in_group == LOGREC_IS_GROUP_ITSELF)
{
/*
Can happen if the transaction got a table write error, then
unlocked its tables and thus wrote a COMMIT record.
*/
tprint(tracef, "\nDiscarding incomplete group before this record\n");
all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
}
else
{
struct st_translog_scanner_data scanner2;
TRANSLOG_HEADER_BUFFER rec2;
/*
There is a complete group for this transaction, containing more
than this event.
*/
tprint(tracef, " ends a group:\n");
len=
translog_read_record_header(all_active_trans[sid].group_start_lsn,
&rec2);
if (len < 0) /* EOF or error */
{
tprint(tracef, "Cannot find record where it should be\n");
goto err;
}
if (translog_scanner_init(rec2.lsn, 1, &scanner2, 1))
{
tprint(tracef, "Scanner2 init failed\n");
goto err;
}
current_group_end_lsn= rec.lsn;
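/*
  Walk the log from the group's first record and handle every record of
  this transaction (same short_trid) that precedes the group-ending record.
*/
do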
{
if (rec2.short_trid == sid) /* it's in our group */
{
const LOG_DESC *log_desc2= &log_record_type_descriptor[rec2.type];
display_record_position(log_desc2, &rec2, 0);
if (apply == MARIA_LOG_CHECK)
{
translog_size_t read_len;
enlarge_buffer(&rec2);
read_len=
translog_read_record(rec2.lsn, 0, rec2.record_length,
log_record_buffer.str, NULL);
if (read_len != rec2.record_length)
{
tprint(tracef, "Cannot read record's body: read %u of"
" %u bytes\n", read_len, rec2.record_length);
translog_destroy_scanner(&scanner2); /* release scanner2 on this error path too */
goto err;
}
}
if (apply == MARIA_LOG_APPLY &&
display_and_apply_record(log_desc2, &rec2))
{
translog_destroy_scanner(&scanner2);
goto err;
}
}
len= translog_read_next_record_header(&scanner2, &rec2);
if (len < 0) /* EOF or error */
{
tprint(tracef, "Cannot find record where it should be\n");
translog_destroy_scanner(&scanner2); /* release scanner2 on this error path too */
goto err;
}
}
while (rec2.lsn < rec.lsn);
translog_free_record_header(&rec2);
/* group finished */
all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
current_group_end_lsn= LSN_IMPOSSIBLE; /* for debugging */
display_record_position(log_desc, &rec, 0);
translog_destroy_scanner(&scanner2);
}
}
if (apply == MARIA_LOG_APPLY &&
display_and_apply_record(log_desc, &rec))
goto err;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
#ifndef DBUG_OFF
current_group_table= NULL;
#endif
}
else /* record does not end group */
{
/* just record the fact, can't know if can execute yet */
if (all_active_trans[sid].group_start_lsn == LSN_IMPOSSIBLE)
{
/* group not yet started */
all_active_trans[sid].group_start_lsn= rec.lsn;
}
}
len= translog_read_next_record_header(&scanner, &rec);
if (len < 0)
{
switch (len)
{
case RECHEADER_READ_EOF:
tprint(tracef, "EOF on the log\n");
break;
case RECHEADER_READ_ERROR:
tprint(tracef, "Error reading log\n");
goto err;
}
break;
}
}
translog_destroy_scanner(&scanner);
translog_free_record_header(&rec);
if (recovery_message_printed == REC_MSG_REDO)
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed unneeded pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove unneeded include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with fewer than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
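Replacing the test for -1 with a test for HA_ERR_FILE_TOO_SHORT amounts to giving a short read its own error code, so callers can tell a truncated file from other failures. A minimal sketch of that idea, assuming an in-memory stand-in for a file; `demo_file`, `demo_pread_exact`, `demo_read_status` and the `DEMO_*` codes are hypothetical names for illustration and are not part of mysys or Maria:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; the real code uses HA_ERR_FILE_TOO_SHORT. */
enum { DEMO_OK = 0, DEMO_ERR_FILE_TOO_SHORT = 1 };

/* Hypothetical in-memory stand-in for an open file. */
typedef struct {
  const unsigned char *data;
  size_t length;
} demo_file;

/* Read exactly `count` bytes at `offset`; fewer available bytes is an error. */
static int demo_pread_exact(const demo_file *f, unsigned char *buf,
                            size_t count, size_t offset)
{
  size_t available = offset < f->length ? f->length - offset : 0;
  size_t n = count < available ? count : available;
  if (n > 0)
    memcpy(buf, f->data + offset, n);
  return n == count ? DEMO_OK : DEMO_ERR_FILE_TOO_SHORT;
}

/* Four-byte sample file used by the helper below. */
static const unsigned char demo_bytes[4] = { 1, 2, 3, 4 };
static const demo_file demo = { demo_bytes, sizeof(demo_bytes) };

/* Convenience wrapper: status of reading `count` (at most 8) bytes. */
static int demo_read_status(size_t count, size_t offset)
{
  unsigned char buf[8];
  if (count > sizeof(buf))
    count = sizeof(buf);
  return demo_pread_exact(&demo, buf, count, offset);
}
```

A caller that previously compared the result against -1 would instead branch on the dedicated code, which is what allows a tool such as maria_read_log to print a more specific message.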
{
fprintf(stderr, " 100%%");
procent_printed= 1;
}
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
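The ma_bitmap.c note above describes a two-step creation of the first bitmap page (extend the file, then write the 2-byte end marker) and why the size chosen for the first step matters if a crash falls between the steps. A minimal sketch of that reasoning follows; the names (`page_state`, `classify_first_bitmap_page`, `crash_after_extend`) and the simplified classification are assumptions made for illustration, not Maria's actual code:

```c
#include <assert.h>
#include <stddef.h>

/* Sizes taken from the commit message: 8192-byte page, 2-byte end marker. */
enum { DEMO_BLOCK_SIZE = 8192, DEMO_MARKER_SIZE = 2 };

/* Hypothetical view of what a later recovery concludes about the page. */
typedef enum {
  PAGE_TOO_SHORT, /* page incomplete on disk: recovery recreates it */
  PAGE_CORRUPT,   /* full-size page without its end marker: recovery fails */
  PAGE_OK
} page_state;

static page_state classify_first_bitmap_page(size_t file_length,
                                             int marker_present)
{
  if (file_length < DEMO_BLOCK_SIZE)
    return PAGE_TOO_SHORT;
  return marker_present ? PAGE_OK : PAGE_CORRUPT;
}

/* State seen after a crash between "extend file" and "write marker". */
static page_state crash_after_extend(size_t extend_to)
{
  return classify_first_bitmap_page(extend_to, 0 /* marker not written */);
}
```

Under these assumptions, extending to the full 8192 bytes leaves a page that looks corrupt after a crash, while extending to only 8190 bytes (page size minus marker size) leaves a page that is merely too short and can be recreated.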
return 0;
err:
translog_destroy_scanner(&scanner);
return 1;
}
/**
@brief Informs about any aborted groups or uncommitted transactions,
prepares for the UNDO phase if needed.
@note Observe that it may init trnman.
*/
static uint end_of_redo_phase(my_bool prepare_for_undo_phase)
{
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
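The changeset note above says that, at the end of the REDO phase, recovery writes one LOGREC_INCOMPLETE_GROUP per transaction that still has an open REDO group, so that a later recovery never re-executes those REDOs. The following is only a minimal sketch of that bookkeeping, not the code in ma_recovery.c: `lsn_t`, `NO_LSN`, `struct active_trans`, `close_incomplete_groups()` and the `log_boundary` callback are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types live in the Maria sources. */
typedef uint64_t lsn_t;
#define NO_LSN ((lsn_t) 0)      /* plays the role of LSN_IMPOSSIBLE */

struct active_trans
{
  uint64_t long_trid;           /* long transaction id of this slot */
  lsn_t group_start_lsn;        /* NO_LSN when no REDO group is open */
};

/*
  Walk all transaction slots after the REDO phase. For every slot that
  still has an open group, call log_boundary() (a stand-in for writing
  LOGREC_INCOMPLETE_GROUP) and mark the group as closed.
  Returns the number of boundaries written.
*/
size_t close_incomplete_groups(struct active_trans *trans, size_t n,
                               void (*log_boundary)(uint64_t long_trid))
{
  size_t i, closed= 0;
  assert(trans != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    if (trans[i].group_start_lsn == NO_LSN)
      continue;                 /* group complete, or slot unused */
    if (log_boundary != NULL)
      log_boundary(trans[i].long_trid);
    trans[i].group_start_lsn= NO_LSN;
    closed++;
  }
  return closed;
}
```

Calling the function a second time on the same array writes no further boundaries, which mirrors the intent of the fix: an incomplete group is fenced off once and then always skipped.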
uint sid, uncommitted= 0;
char llbuf[22];
Remove SAFE_MODE for opt_range as it prevents UPDATE from using keys. REDO optimization (basically, avoid moving blocks from/to pagecache). More command line arguments to maria_read_log. Fixed recovery bug when recreating table. sql/opt_range.cc: Remove SAFE_MODE for opt_range as it prevents UPDATE from using keys. storage/maria/ma_blockrec.c: REDO optimization. Use new interface for pagecache_reads to avoid copying page buffers. storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links. This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects. storage/maria/ma_open.c: Added const to parameter. Added missing braces. storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()). Direct link means that on pagecache_read we get back a pointer to the pagecache buffer. From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors. storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches. Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK. storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open). Moved some variables to function start (portability). Added space to some print messages. storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size. storage/maria/maria_def.h: Changed default page_buffer_size to 10M. storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply). storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning. storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
LSN addr;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
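The ma_bitmap.c note above explains why the first bitmap page is now created by extending the file to 8190 bytes before writing the 2-byte end marker: a crash between the two steps leaves a file that is visibly too short, rather than a full-size page without a marker. A rough sketch of how a reader could tell the three cases apart follows. It is an illustration under stated assumptions, not the engine's code: `classify_first_bitmap_page()`, the enum and the marker bytes are hypothetical, and only the sizes (8192 and 2) come from the note.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITMAP_PAGE_SIZE 8192   /* the "8k bitmap page" of the note above */
#define MARKER_SIZE 2           /* the note speaks of the last 2 bytes */

/* Assumed marker bytes, for illustration only. */
static const unsigned char end_marker[MARKER_SIZE]= { 'b', 'm' };

enum bitmap_page_state
{
  BITMAP_PAGE_OK,               /* full page, marker present */
  BITMAP_PAGE_RECREATE,         /* file too short: creation was interrupted */
  BITMAP_PAGE_CORRUPT           /* full page but no marker */
};

/*
  Classify the first bitmap page from the start of the data file.
  'file' points to the file contents, 'file_length' is its size in bytes.
*/
enum bitmap_page_state
classify_first_bitmap_page(const unsigned char *file, size_t file_length)
{
  assert(file != NULL);
  if (file_length < BITMAP_PAGE_SIZE)
    return BITMAP_PAGE_RECREATE;
  if (memcmp(file + BITMAP_PAGE_SIZE - MARKER_SIZE, end_marker,
             MARKER_SIZE) == 0)
    return BITMAP_PAGE_OK;
  return BITMAP_PAGE_CORRUPT;
}
```

With the old order of operations (extend to 8192, then write the marker), an interrupted creation lands in the "corrupt" case; with the 8190-byte extension it lands in the "recreate" case, which recovery can handle.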
hash_free(&all_dirty_pages);
/*
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example, if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, the live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c.
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
hash_free() can be called multiple times probably, but be safe if that
changes
*/
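The live-checksum changeset note above describes storing a 4-byte checksum delta in the UNDO records and applying it during the REDO phase only when the table state is older than the record (state.is_of_lsn lower than the record's LSN). A small sketch of that arithmetic, under assumptions: `struct table_state`, `checksum_delta()` and `redo_apply_checksum_delta()` are hypothetical names, and unsigned 32-bit wraparound is assumed for the delta.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;         /* hypothetical stand-in for LSN */

struct table_state
{
  uint32_t checksum;            /* live checksum of the table */
  lsn_t is_of_lsn;              /* state is known to be current up to here */
};

/* Delta stored in the log record; unsigned arithmetic wraps modulo 2^32. */
uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/*
  REDO phase: apply the delta only if the state is older than the record.
  Returns 1 if the checksum was updated, 0 if the record was skipped.
  is_of_lsn is deliberately left alone here; when it gets advanced is
  outside the scope of this sketch.
*/
int redo_apply_checksum_delta(struct table_state *state, lsn_t rec_lsn,
                              uint32_t delta)
{
  assert(state != NULL);
  if (state->is_of_lsn >= rec_lsn)
    return 0;                   /* state already reflects this record */
  state->checksum+= delta;
  return 1;
}
```

With the numbers from the note, a change from 123 to 456 gives a delta of 333; a change in the other direction still works because the subtraction and the later addition both wrap modulo 2^32.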
bzero(&all_dirty_pages, sizeof(all_dirty_pages));
my_free(dirty_pages_pool, MYF(MY_ALLOW_ZERO_PTR));
dirty_pages_pool= NULL;
llstr(max_long_trid, llbuf);
tprint(tracef, "Maximum transaction long id seen: %s\n", llbuf);
if (prepare_for_undo_phase && trnman_init(max_long_trid))
return -1;
for (sid= 0; sid <= SHORT_TRID_MAX; sid++)
{
TrID long_trid= all_active_trans[sid].long_trid;
LSN gslsn= all_active_trans[sid].group_start_lsn;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
TRN *trn;
if (gslsn != LSN_IMPOSSIBLE)
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
{
tprint(tracef, "Group at LSN (%lu,0x%lx) short_trid %u incomplete\n",
LSN_IN_PARTS(gslsn), sid);
all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
}
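The commit note above about LOGREC_INCOMPLETE_GROUP can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in ma_recovery.c: the record types, `struct rec`, and `replay()` are hypothetical names invented for the illustration, and the log is reduced to a flat array for a single transaction. It shows why an orphan REDO followed by rollback records gets executed by a later recovery unless a boundary record separates them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record types; NOT the real LOGREC_* enum. */
enum rec_type { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

struct rec { enum rec_type type; int id; };

/*
  Toy REDO phase for one transaction's records.
  A group is a run of REDOs closed by an UNDO or CLR_END; only closed
  groups are executed. REC_INCOMPLETE_GROUP discards the pending run.
  Sets executed[i] to 1 for each executed REDO, returns their count.
*/
static int replay(const struct rec *log, int n, int *executed)
{
  int group_start= -1, count= 0, i, j;
  assert(log != NULL && executed != NULL);
  for (i= 0; i < n; i++)
  {
    executed[i]= 0;
    switch (log[i].type)
    {
    case REC_REDO:
      if (group_start < 0)
        group_start= i;               /* first REDO of a new group */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      if (group_start >= 0)
      {
        for (j= group_start; j < i; j++)
        {
          executed[j]= 1;             /* group is complete: execute it */
          count++;
        }
      }
      group_start= -1;
      break;
    case REC_INCOMPLETE_GROUP:
      group_start= -1;                /* boundary: drop the pending REDOs */
      break;
    }
  }
  return count;
}
```

With the log REDO, UNDO, REDO*, REDO, CLR_END this model executes REDO*, because the CLR_END written by the rollback closes a group that starts at REDO*. With a boundary record inserted right after REDO*, only the first and last REDO are executed.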
if (all_active_trans[sid].undo_lsn != LSN_IMPOSSIBLE)
{
llstr(long_trid, llbuf);
tprint(tracef, "Transaction long_trid %s short_trid %u uncommitted\n",
llbuf, sid);
/* dummy_transaction_object serves only for DDLs */
DBUG_ASSERT(long_trid != 0);
if (prepare_for_undo_phase)
{
if ((trn= trnman_recreate_trn_from_recovery(sid, long_trid)) == NULL)
return -1;
trn->undo_lsn= all_active_trans[sid].undo_lsn;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
trn->first_undo_lsn= all_active_trans[sid].first_undo_lsn |
TRANSACTION_LOGGED_LONG_ID; /* because trn is known in log */
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
if (gslsn != LSN_IMPOSSIBLE)
{
  /*
    UNDO phase will log some records. So, a future recovery may see:
    REDO(from incomplete group) - REDO(from rollback) - CLR_END
    and thus execute the first REDO (finding it in "a complete
    group"). To prevent that:
  */
  LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS];
  LSN lsn;
  if (translog_write_record(&lsn, LOGREC_INCOMPLETE_GROUP,
                            trn, NULL, 0,
                            TRANSLOG_INTERNAL_PARTS, log_array,
                            NULL, NULL))
    return -1;
}
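The boundary role of LOGREC_INCOMPLETE_GROUP described above can be illustrated with a toy scanner. This is only a sketch under simplifying assumptions: `rec_kind`, its four values and `scan_groups()` are hypothetical names invented for the illustration, a single transaction is assumed, and none of it is the actual recovery code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds; not the real LOGREC_* values. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Scan one transaction's records in log order.
  REDOs stay pending until a record that ends the group (UNDO or CLR_END)
  is seen; only then are they marked as executed.
  REC_INCOMPLETE_GROUP discards the pending REDOs instead.
  On return, executed[i] is 1 for every REDO that would be applied.
*/
static void scan_groups(const enum rec_kind *log, size_t n, int *executed)
{
  size_t group_start= 0;              /* first record of the open group */
  size_t i, j;
  assert(log != NULL && executed != NULL);
  for (i= 0; i < n; i++)
  {
    executed[i]= 0;
    switch (log[i]) {
    case REC_REDO:
      break;                          /* remains pending */
    case REC_UNDO:
    case REC_CLR_END:
      /* group is complete: every pending REDO in it gets executed */
      for (j= group_start; j < i; j++)
        executed[j]= 1;
      group_start= i + 1;
      break;
    case REC_INCOMPLETE_GROUP:
      group_start= i + 1;             /* boundary: drop pending REDOs */
      break;
    }
  }
}
```

With the sequence REDO, UNDO, REDO*, REDO, CLR_END the scanner marks REDO* (index 2) as executed, which is the bug the commit message describes. Once the marker is logged right after REDO*, that record falls into a discarded group and only the rollback's own REDO is paired with the CLR_END.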
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
}
uncommitted++;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
}
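The first recovery commit message explains that a record type may not end a group, may end a group, or may be a group in itself (like COMMIT, which aborts any previous unfinished group). A minimal sketch of that three-way rule follows. The enum and function names are hypothetical and chosen for this illustration only; they are not the identifiers used in ma_loghandler.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three cases described in the commit message. */
enum ends_group { NOT_GROUP_END, GROUP_END, GROUP_IN_ITSELF };

/*
  Count how many records of a log would be executed.
  Records that do not end a group are held back until a GROUP_END arrives.
  A GROUP_IN_ITSELF record is executed on its own and throws away whatever
  unfinished group preceded it.
*/
static size_t count_executed(const enum ends_group *log, size_t n)
{
  size_t pending= 0;                  /* records of the unfinished group */
  size_t executed= 0;
  size_t i;
  assert(log != NULL);
  for (i= 0; i < n; i++)
  {
    switch (log[i]) {
    case NOT_GROUP_END:
      pending++;
      break;
    case GROUP_END:
      executed+= pending + 1;         /* the group plus its ending record */
      pending= 0;
      break;
    case GROUP_IN_ITSELF:
      pending= 0;                     /* abort the unfinished group */
      executed++;
      break;
    }
  }
  return executed;
}
```

In the write-failure scenario from the commit message (two REDOs, no UNDO, then a COMMIT), only the COMMIT counts as executed and the orphaned REDOs are skipped.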
#ifdef MARIA_VERSIONING
/*
If real recovery: if transaction was committed, move it to some separate
list for soon purging.
*/
#endif
}
my_free(all_active_trans, MYF(MY_ALLOW_ZERO_PTR));
all_active_trans= NULL;
/*
The UNDO phase uses some normal run-time code of ROLLBACK: generates log
records, etc; prepare tables for that
*/
addr= translog_get_horizon();
for (sid= 0; sid <= SHARE_ID_MAX; sid++)
{
MARIA_HA *info= all_tables[sid].info;
if (info != NULL)
{
prepare_table_for_close(info, addr);
/*
We don't close the table, though: we leave it open because the UNDO
phase will likely need it.
*/
if (prepare_for_undo_phase)
translog_assign_id_to_share_from_recovery(info->s, sid);
}
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
return uncommitted;
}
static int run_undo_phase(uint uncommitted)
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
{
if (uncommitted > 0)
{
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
checkpoint_useful= TRUE;
if (tracef != stdout)
{
if (recovery_message_printed == REC_MSG_NONE)
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
print_preamble();
fprintf(stderr, "transactions to roll back:");
recovery_message_printed= REC_MSG_UNDO;
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
tprint(tracef, "%u transactions will be rolled back\n", uncommitted);
for( ; ; )
{
if (recovery_message_printed == REC_MSG_UNDO)
fprintf(stderr, " %u", uncommitted);
if ((uncommitted--) == 0)
break;
char llbuf[22];
TRN *trn= trnman_get_any_trn();
DBUG_ASSERT(trn != NULL);
llstr(trn->trid, llbuf);
tprint(tracef, "Rolling back transaction of long id %s\n", llbuf);
/* Execute all undo entries */
while (trn->undo_lsn)
{
TRANSLOG_HEADER_BUFFER rec;
LOG_DESC *log_desc;
if (translog_read_record_header(trn->undo_lsn, &rec) ==
RECHEADER_READ_ERROR)
return 1;
log_desc= &log_record_type_descriptor[rec.type];
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
display_record_position(log_desc, &rec, 0);
if (log_desc->record_execute_in_undo_phase(&rec, trn))
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
tprint(tracef, "Got error %d when executing undo\n", my_errno);
return 1;
}
}
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
if (trnman_rollback_trn(trn))
return 1;
/* We may want to spawn a few threads (4?) instead of 1 */
/* In the future, we want to have this phase *online* */
}
}
return 0;
}
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
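The commit message above describes how the REDO phase recovers state.records idempotently: a record-count change is applied only when the log record's LSN is greater than state.is_of_lsn, and is_of_lsn is then advanced. The following is a minimal, standalone sketch of that rule, not the code in ma_recovery.c. All names in it (lsn_t, struct table_state, struct log_rec, apply_to_state, replay_log) are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified types; the real ones live in maria_def.h. */
typedef uint64_t lsn_t;

enum rec_type { UNDO_ROW_INSERT, UNDO_ROW_DELETE };

struct table_state
{
  uint64_t records;   /* row count kept in the index file's header */
  lsn_t    is_of_lsn; /* the state already reflects the log up to this LSN */
};

struct log_rec
{
  lsn_t         lsn;
  enum rec_type type;
};

/* Apply one record to the state, unless the state already includes it. */
static void apply_to_state(struct table_state *st, const struct log_rec *rec)
{
  assert(st != NULL && rec != NULL);
  if (rec->lsn <= st->is_of_lsn)
    return;                        /* already counted: skip */
  if (rec->type == UNDO_ROW_INSERT)
    st->records++;                 /* a row was inserted */
  else
    st->records--;                 /* a row was deleted */
}

/*
  Replay a log in LSN order, then mark the state as being "of" the last
  record seen, so that a second replay of the same log changes nothing.
*/
static void replay_log(struct table_state *st,
                       const struct log_rec *log, size_t n)
{
  size_t i;
  for (i= 0; i < n; i++)
    apply_to_state(st, &log[i]);
  if (n > 0 && log[n - 1].lsn > st->is_of_lsn)
    st->is_of_lsn= log[n - 1].lsn;
}
```

Replaying the same log twice leaves records unchanged in this sketch, because every record's LSN is then at or below is_of_lsn. This is the idempotency property the commit message says ma_test_recovery checks.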
/**
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
@brief re-enables transactionality, updates is_of_horizon
@param info table
@param horizon address to set is_of_horizon to
*/
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state). storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
static void prepare_table_for_close(MARIA_HA *info, TRANSLOG_ADDRESS horizon)
{
  MARIA_SHARE *share= info->s;
  /*
    In a fully-forward REDO phase (no checkpoint record),
    state is now at least as new as the LSN of the current record. It may be
    newer, in case we are seeing a LOGREC_FILE_ID which tells us to close a
    table, but that table was later modified further in the log.
But if we parsed a checkpoint record, it may be this way in the log:
FILE_ID(6->t2)... FILE_ID(6->t1)... CHECKPOINT(6->t1)
Checkpoint parsing opened t1 with id 6; first FILE_ID above is going to
make t1 close; the first condition below is however false (when checkpoint
was taken it increased is_of_horizon) and so it works. For safety we
add the second condition.
*/
if (cmp_translog_addr(share->state.is_of_horizon, horizon) < 0 &&
    cmp_translog_addr(share->lsn_of_file_id, horizon) < 0)
{
  share->state.is_of_horizon= horizon;
  _ma_state_info_write_sub(share->kfile.file, &share->state, 1);
}
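The two-condition guard above can be illustrated with a small standalone sketch. It is not the Maria code: `toy_share`, `toy_cmp_addr` and `toy_close_at` are hypothetical names, and log addresses are simplified to plain 64-bit integers. It only shows the idea stated in the comment: the stored horizon is moved forward when the table is closed at a log position later than both the state's horizon and the latest id->name mapping, and it is never moved backwards.

```c
#include <stdint.h>

/* Hypothetical stand-in for a log address; the real type lives in the Maria headers. */
typedef int64_t toy_addr;

typedef struct {
  toy_addr is_of_horizon;   /* log position the stored state is known to reflect */
  toy_addr lsn_of_file_id;  /* position of the latest id->name mapping record */
} toy_share;

/* Negative, zero or positive as a is before, at, or after b. */
int toy_cmp_addr(toy_addr a, toy_addr b)
{
  return (a > b) - (a < b);
}

/*
  Called when a table is closed while the log is being read at position
  'horizon'. Returns 1 if the horizon was advanced (the point at which the
  real code would rewrite the state), 0 if nothing changed.
*/
int toy_close_at(toy_share *share, toy_addr horizon)
{
  if (toy_cmp_addr(share->is_of_horizon, horizon) < 0 &&
      toy_cmp_addr(share->lsn_of_file_id, horizon) < 0)
  {
    share->is_of_horizon= horizon;
    return 1;
  }
  return 0;
}
```

In the checkpoint scenario from the comment, a share opened by checkpoint parsing with `is_of_horizon` 300 and `lsn_of_file_id` 200, then closed because of an older FILE_ID record at position 100, keeps its horizon at 300. A share with values 50 and 40 closed at position 100 has its horizon advanced to 100.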
_ma_reenable_logging_for_table(share);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
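The pinned-pages fix described in the commit message above can be sketched roughly as follows. This is a minimal model, not the code in ma_blockrec.c or ma_recovery.c: the `demo_page` and `demo_group` types and both function names are hypothetical, and it assumes a REDO is skipped whenever the page's LSN is already at or past the REDO's LSN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical page model: only the LSN and an "applied" counter matter. */
typedef struct { lsn_t lsn; int applied; int pinned; } demo_page;

#define DEMO_MAX_PINNED 16

/* Hypothetical list of pages touched by the current group of REDOs. */
typedef struct {
  demo_page *pinned[DEMO_MAX_PINNED];
  size_t count;
} demo_group;

/*
  Apply one REDO to a page. The page's LSN is deliberately left unchanged,
  so a later REDO of the same group about the same page is not skipped.
  Returns 1 if the REDO was applied, 0 if it was skipped.
*/
static int demo_apply_redo(demo_group *group, demo_page *page, lsn_t redo_lsn)
{
  if (page->lsn >= redo_lsn)
    return 0;                           /* page already has this change */
  page->applied++;
  if (!page->pinned)
  {
    assert(group->count < DEMO_MAX_PINNED);
    page->pinned= 1;                    /* keep pinned until the group ends */
    group->pinned[group->count++]= page;
  }
  return 1;
}

/* At the end of the group: stamp every pinned page with the UNDO's LSN. */
static void demo_end_group(demo_group *group, lsn_t undo_lsn)
{
  size_t i;
  for (i= 0; i < group->count; i++)
  {
    group->pinned[i]->lsn= undo_lsn;
    group->pinned[i]->pinned= 0;
  }
  group->count= 0;
}
```

Had the first REDO stamped the page with the UNDO's LSN immediately, the second REDO of the same group would have seen a page LSN above its own and been skipped, which is the bug the message describes.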
info->trn= NULL; /* safety */
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
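The three-way group classification described in the ma_loghandler.h note above (a record may not end a group, may end a group, or may be a group in itself) could be modelled like this. It is only a sketch: the enum values, the `demo_scan` counters and `demo_see_record()` are hypothetical names, and the assumption that a REDO does not end a group while an UNDO does is taken from the message's description of REDO groups being closed by an UNDO.

```c
#include <assert.h>

/* Hypothetical names; the message only says the bool became an enum. */
typedef enum {
  DEMO_NOT_GROUP_END,   /* e.g. a REDO: more records of the group follow */
  DEMO_GROUP_END,       /* e.g. an UNDO: closes the current group */
  DEMO_GROUP_ITSELF     /* e.g. a COMMIT: stands alone, aborts open group */
} demo_group_kind;

/* Counters kept while scanning the log. */
typedef struct {
  int pending;    /* records of the still-open group */
  int executed;   /* records that would be executed */
  int aborted;    /* records dropped because their group never ended */
} demo_scan;

static void demo_see_record(demo_scan *scan, demo_group_kind kind)
{
  switch (kind) {
  case DEMO_NOT_GROUP_END:
    scan->pending++;                       /* wait for the end of the group */
    break;
  case DEMO_GROUP_END:
    scan->executed+= scan->pending + 1;    /* whole group, including this one */
    scan->pending= 0;
    break;
  case DEMO_GROUP_ITSELF:
    scan->aborted+= scan->pending;         /* unfinished group is abandoned */
    scan->pending= 0;
    scan->executed++;                      /* the record itself is executed */
    break;
  }
}
```

With this shape, two REDOs followed directly by a COMMIT (the write-failure scenario in the message) leave the REDOs unexecuted, while a REDO followed by its UNDO is executed as a complete group.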
}
static MARIA_HA *get_MARIA_HA_from_REDO_record(const
TRANSLOG_HEADER_BUFFER *rec)
{
uint16 sid;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
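The reasoning behind the 8190-byte chsize in the ma_bitmap.c note above can be illustrated with a small decision function. This is a hypothetical model only: `demo_check_first_bitmap()` and the `demo_verdict` values are invented for the illustration, and the real recovery code inspects the file on disk instead of taking these two parameters.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BLOCK_SIZE  8192   /* bitmap page size given in the message */
#define DEMO_MARKER_SIZE 2      /* end marker occupies the last 2 bytes */

typedef enum {
  DEMO_PAGE_OK,         /* full page with its end marker */
  DEMO_PAGE_RECREATE,   /* page too short: recreate it entirely */
  DEMO_PAGE_CORRUPTED   /* full-length page without marker: recovery fails */
} demo_verdict;

/*
  What recovery would conclude about the first bitmap page, given the
  data file length and whether the end marker was found.
*/
static demo_verdict demo_check_first_bitmap(size_t file_length,
                                            int marker_present)
{
  if (file_length < DEMO_BLOCK_SIZE)
    return DEMO_PAGE_RECREATE;
  return marker_present ? DEMO_PAGE_OK : DEMO_PAGE_CORRUPTED;
}
```

A crash between the two steps of the old scheme leaves 8192 bytes without a marker, which this model reports as corrupted; with the new scheme the same crash leaves 8192 - 2 = 8190 bytes, which is recognised as a short page and recreated.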
pgcache_page_no_t page;
MARIA_HA *info;
char llbuf[22];
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result now that redo/undo of key pages works storage/maria/ma_test_recovery: Updated test now that redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
my_bool index_page_redo_entry= 0;
print_redo_phase_progress(rec->lsn);
sid= fileid_korr(rec->header);
page= page_korr(rec->header + FILEID_STORE_SIZE);
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with uninitialized variable storage/maria/ma_test2.c: Removed unused code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
switch (rec->type) {
/* not all REDO records have a page: */
case LOGREC_REDO_INDEX_NEW_PAGE:
case LOGREC_REDO_INDEX:
case LOGREC_REDO_INDEX_FREE_PAGE:
  index_page_redo_entry= 1;
  /* Fall through */
case LOGREC_REDO_INSERT_ROW_HEAD:
case LOGREC_REDO_INSERT_ROW_TAIL:
case LOGREC_REDO_PURGE_ROW_HEAD:
case LOGREC_REDO_PURGE_ROW_TAIL:
  llstr(page, llbuf);
  tprint(tracef, " For page %s of table of short id %u", llbuf, sid);
  break;
  /* other types could print their info here too */
default:
  break;
}
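The header decoding done above with fileid_korr() and page_korr() can be illustrated with a minimal standalone sketch. It assumes the record header starts with a 2-byte little-endian short table id followed by a 5-byte little-endian page number; these sizes and all sketch_* names are assumptions made for illustration, not the engine's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 2-byte short table id, then a 5-byte page number. */
enum { SKETCH_FILEID_SIZE = 2, SKETCH_PAGE_SIZE = 5 };

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t sketch_le_read(const unsigned char *p, unsigned n)
{
  uint64_t v = 0;
  assert(p != 0 && n <= 8);
  while (n-- > 0)
    v = (v << 8) | p[n];          /* most significant byte is read first */
  return v;
}

/* Hypothetical stand-in for fileid_korr(): the table's short id. */
static unsigned sketch_fileid(const unsigned char *header)
{
  return (unsigned) sketch_le_read(header, SKETCH_FILEID_SIZE);
}

/* Hypothetical stand-in for page_korr(): the page number that follows. */
static uint64_t sketch_page(const unsigned char *header)
{
  return sketch_le_read(header + SKETCH_FILEID_SIZE, SKETCH_PAGE_SIZE);
}
```

Under these assumptions, a header beginning with the bytes 03 00 2A 00 00 00 00 would decode to short id 3 and page 42.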
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
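The crash-safety argument in the ma_bitmap.c note above (extend the file to 8190 bytes only, then write the 2-byte end marker) can be sketched as a small classifier. This is an illustration under stated assumptions, not the engine's code: the 8192-byte page size and 2-byte trailing marker come from the commit message, the "bm" marker value comes from an earlier commit message in this file's history, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SKETCH_BLOCK_SIZE = 8192, SKETCH_MARKER_SIZE = 2 };

/* Assumed end marker of a bitmap page. */
static const unsigned char sketch_marker[SKETCH_MARKER_SIZE] = { 'b', 'm' };

enum sketch_bitmap_state
{
  SKETCH_BITMAP_RECREATE,   /* page too short: recovery rebuilds it */
  SKETCH_BITMAP_OK,         /* full page with a valid end marker */
  SKETCH_BITMAP_CORRUPT     /* full page without its end marker */
};

/* Classify the first bitmap page from the bytes present in the file. */
static enum sketch_bitmap_state
sketch_check_first_bitmap(const unsigned char *file, size_t file_length)
{
  assert(file != 0);
  if (file_length < SKETCH_BLOCK_SIZE)
    return SKETCH_BITMAP_RECREATE;
  if (memcmp(file + SKETCH_BLOCK_SIZE - SKETCH_MARKER_SIZE,
             sketch_marker, SKETCH_MARKER_SIZE) == 0)
    return SKETCH_BITMAP_OK;
  return SKETCH_BITMAP_CORRUPT;
}
```

With the old order (extend to 8192 bytes, then overwrite the last two), a crash between the two steps leaves a file that this check would classify as corrupt. With the new order, the same crash leaves an 8190-byte file that is classified as too short and simply recreated.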
info= all_tables[sid].info;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDOs about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
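The bug described above (two REDOs about the same page in one group) can be reproduced in a toy model. The sketch assumes the usual skip rule, namely that a REDO is not applied when the page's LSN is already at or beyond the REDO's LSN; that rule, the page model, and all sketch_* names are assumptions for illustration and are not taken from this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;

struct sketch_page
{
  sketch_lsn lsn;   /* LSN currently stamped on the page */
  int value;        /* stands in for the page contents */
  int pinned;       /* touched by the current group, not yet stamped */
};

struct sketch_redo
{
  sketch_lsn lsn;   /* LSN of this REDO record */
  size_t page_no;   /* page it modifies */
  int delta;        /* change to apply */
};

/*
  Apply one group of REDOs that ends with an UNDO at undo_lsn.
  stamp_at_once != 0 models the old behaviour (stamp the page with the
  UNDO's LSN right after each REDO); 0 models the fix (keep pages pinned,
  stamp them only when the whole group is done).
*/
static void sketch_apply_group(struct sketch_page *pages, size_t n_pages,
                               const struct sketch_redo *redos,
                               size_t n_redos, sketch_lsn undo_lsn,
                               int stamp_at_once)
{
  size_t i;
  for (i= 0; i < n_redos; i++)
  {
    struct sketch_page *page;
    assert(redos[i].page_no < n_pages);
    page= &pages[redos[i].page_no];
    if (page->lsn >= redos[i].lsn)
      continue;                       /* assumed rule: already applied */
    page->value+= redos[i].delta;
    if (stamp_at_once)
      page->lsn= undo_lsn;            /* later REDOs of the group get skipped */
    else
      page->pinned= 1;                /* defer stamping to end of group */
  }
  for (i= 0; i < n_pages; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;         /* unpin and stamp with UNDO's LSN */
      pages[i].pinned= 0;
    }
  }
}
```

In this model, a group with two REDOs (LSN 10 and 11) for page 0 and an UNDO at LSN 12 applies only the first REDO when stamping at once, and both REDOs when stamping is deferred to the end of the group.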
#ifndef DBUG_OFF
DBUG_ASSERT(current_group_table == NULL || current_group_table == info);
current_group_table= info;
#endif
if (info == NULL)
{
tprint(tracef, ", table skipped, so skipping record\n");
2007-07-26 11:56:21 +02:00
return NULL;
}
tprint(tracef, ", '%s'", info->s->open_file_name);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
if (cmp_translog_addr(rec->lsn, info->s->lsn_of_file_id) <= 0)
{
/*
This can happen only when processing a record that precedes the
checkpoint record.
The id->name mapping is newer than the REDO record, so the table that
the REDO applied to has certainly been flushed and forced to disk (id
re-assignment implies this). The REDO can be ignored, and must be, as
we do not know which table it applied to.
*/
DBUG_ASSERT(cmp_translog_addr(rec->lsn, checkpoint_start) < 0);
tprint(tracef, ", table's LOGREC_FILE_ID has LSN (%lu,0x%lx) more recent"
" than record, skipping record",
LSN_IN_PARTS(info->s->lsn_of_file_id));
return NULL;
}
/* detect an open instance of a dropped table (internal bug) */
DBUG_ASSERT(info->s->last_version != 0);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
if (cmp_translog_addr(rec->lsn, checkpoint_start) < 0)
{
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
uint64 file_and_page_id=
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now that key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now that key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
(((uint64) (index_page_redo_entry ? all_tables[sid].org_kfile :
all_tables[sid].org_dfile)) << 32) | page;
struct st_dirty_page *dirty_page= (struct st_dirty_page *)
  hash_search(&all_dirty_pages,
              (uchar *)&file_and_page_id, sizeof(file_and_page_id));
/*
  Skip this REDO if the page is not in the dirty pages list, or if the
  record is older than the page's rec_lsn.
*/
if ((dirty_page == NULL) ||
    cmp_translog_addr(rec->lsn, dirty_page->rec_lsn) < 0)
{
tprint(tracef, ", ignoring because of dirty_pages list\n");
return NULL;
}
}
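The dirty-pages test in the block above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the code in ma_recovery.c: `dirty_page_t`, `make_file_and_page_id()`, `find_dirty_page()` and `redo_is_needed()` are hypothetical names, a linear scan stands in for the `hash_search()` lookup, LSNs are plain integers compared with `<` instead of `cmp_translog_addr()`, and the page number is assumed to fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the dirty pages list. */
typedef struct
{
  uint64_t file_and_page_id; /* (file id << 32) | page number */
  uint64_t rec_lsn;          /* assumed: LSN from which REDOs for this page matter */
} dirty_page_t;

/* Pack a file id and a page number into a single 64-bit lookup key. */
uint64_t make_file_and_page_id(uint32_t file, uint32_t page)
{
  return ((uint64_t) file << 32) | page;
}

/* Linear scan; an assumption replacing the hash lookup of the real code. */
const dirty_page_t *find_dirty_page(const dirty_page_t *list, size_t count,
                                    uint64_t id)
{
  size_t i;
  for (i= 0; i < count; i++)
  {
    if (list[i].file_and_page_id == id)
      return &list[i];
  }
  return NULL;
}

/*
  Return 1 if a REDO with LSN record_lsn has to be applied to the page,
  0 if it can be ignored because of the dirty pages list.
*/
int redo_is_needed(const dirty_page_t *list, size_t count,
                   uint32_t file, uint32_t page, uint64_t record_lsn)
{
  const dirty_page_t *dirty_page=
    find_dirty_page(list, count, make_file_and_page_id(file, page));
  if (dirty_page == NULL)               /* page is not in the list */
    return 0;
  if (record_lsn < dirty_page->rec_lsn) /* record is older than rec_lsn */
    return 0;
  return 1;
}
```

With a list holding a single entry for file 7, page 3 with rec_lsn 100, a record with LSN 100 for that page is applied, a record with LSN 99 is skipped, and any record for another page or another file is skipped as well.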
2007-07-26 11:56:21 +02:00
/*
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c: noted an issue in my_realloc()
sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc: update to new prototype
storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes, then pwrite (overwrite) of the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full-size bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it is corrupted, and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO involves creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page
storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h: new prototype
storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDOs for INSERT/UPDATE/DELETE are always about real transactions, so we can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h: new function
storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c: tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed in the UNDO phase
* a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
So we are going to read the page, and if its LSN is older than the
record's we will modify the page
*/
tprint(tracef, ", applying record\n");
- speed optimization: minimize writes to transactional Maria tables: don't write data pages, state, and open_count at the end of each statement. Data pages will be written by a background thread periodically. State will be written by Checkpoint periodically. open_count serves to detect when a table is potentially damaged due to an unclean mysqld stop, but thanks to recovery an unclean mysqld stop will be corrected and so open_count becomes useless. As state is written less often, it is often obsolete on disk, so we should avoid reading it from disk.
- having removed the data page writes above, it is necessary to put them back at the start of some statements like check, repair and delete_all. It was already necessary in fact (see ma_delete_all.c).
- disabling CACHE INDEX on Maria tables for now (fixes crash of test 'key_cache' when run with --default-storage-engine=maria).
- correcting some fishy code in maria_extra.c (we possibly could lose index pages when doing a DROP TABLE under Windows, in theory).
storage/maria/ha_maria.cc: disable CACHE INDEX in Maria for now (there is a single cache for now), it crashes and it's not a priority
storage/maria/ma_bitmap.c: debug message
storage/maria/ma_check.c: The statement before maria_repair() may not flush state, so it needs to be done by maria_repair() (indeed this function uses maria_open(HA_OPEN_COPY) so reads state from disk, so needs to find it up-to-date on disk). For safety (but normally this is not needed) we remove index blocks out of the cache before repairing. _ma_flush_blocks() becomes _ma_flush_table_files_after_repair(): it now additionally flushes the data file and state and syncs files. As a side effect, the assertion "no WRITE_CACHE_USED" from _ma_flush_table_files() fired so we move all end_io_cache() done at the end of repair to before the calls to _ma_flush_table_files_after_repair().
storage/maria/ma_close.c: when closing a transactional table, we fsync it. But we need to do this only after writing its state. We need to write the state at close time only for transactional tables (the other tables do that at last unlock). Putting back the O_RDONLY||crashed condition which I had removed earlier. Unmap the file before syncing it (does not matter now as Maria does not use mmap)
storage/maria/ma_delete_all.c: need to flush data pages before chsize-ing the data file. This was needed even when we flushed data pages at the end of each statement, because we did not do it under LOCK TABLES anyway: the change here thus fixes this bug: create table t(a int) engine=maria;lock tables t write; insert into t values(1);delete from t;unlock tables;check table t; "Size of datafile is: 16384 Should be: 8192" (an obsolete page went to disk after the chsize(), at unlock time).
storage/maria/ma_extra.c: When doing share->last_version=0, we make the MARIA_SHARE-in-memory invisible to future openers, so need to have an up-to-date state on disk for them. The same way, future openers will reopen the data and index file, so they will not find our cached blocks, so we need to flush them to disk. In HA_EXTRA_FORCE_REOPEN, this probably happens naturally as all tables normally get closed, we however add a safety flush. In HA_EXTRA_PREPARE_FOR_RENAME, we need to do the flushing. On Windows we additionally need to close files. In HA_EXTRA_PREPARE_FOR_DROP, we don't need to flush anything but remove dirty cached blocks from memory. On Windows we need to close files. Closing files forces us to sync them before (requirement for transactional tables). For mutex reasons (don't lock intern_lock twice), we move maria_lock_database() and _ma_decrement_open_count() first in the list of operations. Also flush the data file in HA_EXTRA_FLUSH.
storage/maria/ma_locking.c: For transactional tables:
- don't write data pages / state at unlock time; as a consequence, "share->changed=0" cannot be done.
- don't write state in _ma_writeinfo()
- don't maintain open_count on disk (Recovery corrects the table in case of crash anyway, and we gain speed by not writing open_count to disk).
For non-transactional tables, flush the state at unlock only if the table was changed (optimization). Code which read the state from disk is relevant only with external locking, so we disable it (if we want to re-enable it, it should not apply to transactional tables, as the state on disk may be obsolete: such tables do not flush state at unlock anymore). The comment "We have to flush the write cache" is now wrong because maria_lock_database(F_UNLCK) now happens before thr_unlock(), and we are not using external locking.
storage/maria/ma_open.c: _ma_state_info_read() is only used in ma_open.c, making it static
storage/maria/ma_recovery.c: set MARIA_SHARE::changed to TRUE when we are going to apply a REDO/UNDO, so that the state gets flushed at close.
storage/maria/ma_test_recovery.expected: Changes introduced by this patch:
- good: the "open" (table open, not properly closed) is gone, it was pointless for a recovered table
- bad: probably stemming from different moments of writing the index's state (_ma_writeinfo() used to write the state after every row write in ma_test* programs, doesn't anymore as the table is transactional): some differences in indexes (not relevant as we don't yet have recovery for them); some differences in count of records (changed from a wrong value to another wrong value) (not relevant as we don't recover this count correctly yet anyway, though a patch will be pushed soon).
storage/maria/ma_test_recovery: for repeatable output, no names of varying directories.
storage/maria/maria_chk.c: function renamed
storage/maria/maria_def.h: Function became local to ma_open.c. Function renamed.
2007-09-06 16:53:26 +02:00
_ma_writeinfo(info, WRITEINFO_UPDATE_KEYFILE); /* to flush state on close */
return info;
}
static MARIA_HA *get_MARIA_HA_from_UNDO_record(const
                                               TRANSLOG_HEADER_BUFFER *rec)
{
  uint16 sid;
  MARIA_HA *info;
  sid= fileid_korr(rec->header + LSN_STORE_SIZE);
  tprint(tracef, " For table of short id %u", sid);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
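The bitmap-page note in the commit message above (truncate to 8190 bytes first, write the 2-byte end marker afterwards) can be illustrated with a small sketch. This is a toy model under stated assumptions, not the code in ma_bitmap.c: the names `toy_bitmap_state` and `toy_first_bitmap_state` and the fixed 8192-byte block size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; this only models the reasoning in the commit note. */
enum toy_bitmap_state { TOY_BITMAP_MISSING, TOY_BITMAP_OK, TOY_BITMAP_CORRUPT };

#define TOY_BLOCK_SIZE 8192u

/*
  Classify the first bitmap page from what a reader could observe after a
  crash: the file length and whether the end marker is present.
  A file shorter than one block means the page was never completed, so it
  can simply be recreated. A full-length page without its marker looks
  corrupted, which is the case the 8190-byte truncation avoids.
*/
static enum toy_bitmap_state
toy_first_bitmap_state(uint64_t file_length, int marker_present)
{
  if (file_length < TOY_BLOCK_SIZE)
    return TOY_BITMAP_MISSING;
  return marker_present ? TOY_BITMAP_OK : TOY_BITMAP_CORRUPT;
}
```

With the old sequence a crash between the two writes leaves 8192 bytes and no marker (reported as corrupt); with the new sequence it leaves 8190 bytes (reported as missing and recreated).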
  info= all_tables[sid].info;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
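The commit note above explains why a page must not be stamped with the UNDO's LSN as soon as one REDO is applied: a later REDO of the same group for the same page would then look already applied and be skipped. A minimal sketch of the deferred stamping follows. It is a toy model, not the Maria page cache: `toy_page`, `toy_group`, `toy_apply_redo` and `toy_end_group` are hypothetical names, and the "page LSN >= record LSN means skip" rule is assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model; all names here are hypothetical, not Maria's real structures. */
typedef uint64_t toy_lsn;

typedef struct { toy_lsn lsn; int rows; int pinned; } toy_page;

#define TOY_MAX_PINNED 16

typedef struct
{
  toy_page *pinned[TOY_MAX_PINNED];
  size_t n_pinned;
} toy_group;

/*
  Apply one REDO. The record is skipped only if the page already carries an
  LSN at least as recent as the record. The page LSN is left unchanged; the
  page is only remembered as pinned until the group ends.
*/
static int toy_apply_redo(toy_group *group, toy_page *page, toy_lsn redo_lsn)
{
  if (page->lsn >= redo_lsn)
    return 0;                           /* change already on the page */
  page->rows++;                         /* stand-in for the real page change */
  if (!page->pinned)
  {
    assert(group->n_pinned < TOY_MAX_PINNED);
    page->pinned= 1;
    group->pinned[group->n_pinned++]= page;
  }
  return 1;
}

/* End of group: unpin every touched page and stamp it with the UNDO's LSN. */
static void toy_end_group(toy_group *group, toy_lsn undo_lsn)
{
  size_t i;
  for (i= 0; i < group->n_pinned; i++)
  {
    group->pinned[i]->lsn= undo_lsn;
    group->pinned[i]->pinned= 0;
  }
  group->n_pinned= 0;
}
```

For a group made of REDOs at LSN 10 and 11 on the same page followed by an UNDO at LSN 12, both REDOs are applied because the page keeps its old LSN until `toy_end_group()` runs; stamping 12 right after the first REDO would have made the second one look stale.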
#ifndef DBUG_OFF
  DBUG_ASSERT(current_group_table == NULL || current_group_table == info);
  current_group_table= info;
#endif
  if (info == NULL)
  {
    tprint(tracef, ", table skipped, so skipping record\n");
    return NULL;
  }
  tprint(tracef, ", '%s'", info->s->open_file_name);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
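The commit note above says a record is skipped when the table's id->name mapping (its LOGREC_FILE_ID) is newer than the record, which can happen for records located before the checkpoint record. The comparison can be sketched as below. This is a simplified illustration: `toy_lsn`, `toy_make_lsn`, `toy_cmp_lsn` and `toy_record_predates_mapping` are hypothetical names, and the 32/32-bit split of file number and offset is an assumption, not Maria's on-disk LSN layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical LSN: file number in the high half, offset in the low half. */
typedef uint64_t toy_lsn;

static toy_lsn toy_make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((toy_lsn) file_no << 32) | offset;
}

/* Three-way comparison: negative, zero or positive, like a cmp function. */
static int toy_cmp_lsn(toy_lsn a, toy_lsn b)
{
  return (a > b) - (a < b);
}

/*
  True if the record was written at or before the point where the current
  id->name mapping was logged. Such a record refers to whatever table held
  this short id earlier, so it must not be applied to the current table.
*/
static int toy_record_predates_mapping(toy_lsn rec_lsn, toy_lsn lsn_of_file_id)
{
  return toy_cmp_lsn(rec_lsn, lsn_of_file_id) <= 0;
}
```

A record at (1,0x100) is skipped when the mapping dates from (1,0x200), while a record at (2,0x10) in a later log file is applied.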
  if (cmp_translog_addr(rec->lsn, info->s->lsn_of_file_id) <= 0)
  {
    tprint(tracef, ", table's LOGREC_FILE_ID has LSN (%lu,0x%lx) more recent"
           " than record, skipping record",
           LSN_IN_PARTS(info->s->lsn_of_file_id));
return NULL;
}
DBUG_ASSERT(info->s->last_version != 0);
_ma_writeinfo(info, WRITEINFO_UPDATE_KEYFILE); /* to flush state on close */
tprint(tracef, ", applying record\n");
return info;
}
/**
   @brief Parses checkpoint record.

   Builds from it the dirty_pages list (a hash), opens tables and maps them to
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started unitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loopp instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
   their 2-byte IDs, recreates transactions (not real TRNs though).
   @return LSN from where in the log the REDO phase should start
     @retval LSN_ERROR error
     @retval other     ok
*/
static LSN parse_checkpoint_record(LSN lsn)
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
{
  ulong i;
  TRANSLOG_HEADER_BUFFER rec;
  TRANSLOG_ADDRESS start_address;
  tprint(tracef, "Loading data from checkpoint record at LSN (%lu,0x%lx)\n",
         LSN_IN_PARTS(lsn));
int len= translog_read_record_header(lsn, &rec);
if (len == RECHEADER_READ_ERROR)
{
tprint(tracef, "Cannot find checkpoint record where it should be\n");
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started unitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loopp instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
return LSN_ERROR;
}
enlarge_buffer(&rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec.lsn, 0, rec.record_length,
log_record_buffer.str, NULL) !=
rec.record_length)
{
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
eprint(tracef, "Failed to read record\n");
return LSN_ERROR;
}
char *ptr= log_record_buffer.str;
start_address= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
/* transactions */
uint nb_active_transactions= uint2korr(ptr);
ptr+= 2;
tprint(tracef, "%u active transactions\n", nb_active_transactions);
LSN minimum_rec_lsn_of_active_transactions= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
/*
how much brain juice and discussions there was to come to writing this
line
*/
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started unitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
set_if_smaller(start_address, minimum_rec_lsn_of_active_transactions);
for (i= 0; i < nb_active_transactions; i++)
{
uint16 sid= uint2korr(ptr);
ptr+= 2;
TrID long_id= uint6korr(ptr);
ptr+= 6;
DBUG_ASSERT(sid > 0 && long_id > 0);
LSN undo_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
LSN first_undo_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
new_transaction(sid, long_id, undo_lsn, first_undo_lsn);
}
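The loop above reads one fixed-size entry per active transaction: a 2-byte short id, a 6-byte long id, then two stored LSNs. A minimal standalone sketch of that decoding follows. It assumes little-endian storage, a 7-byte stored LSN (3-byte log file number followed by a 4-byte offset) and a 64-bit in-memory LSN; all `sketch_*` names are hypothetical and are not part of ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a stored LSN is 7 bytes (3-byte file number + 4-byte offset). */
enum { SKETCH_LSN_STORE_SIZE = 7 };

/* Hypothetical in-memory form of one active-transaction entry. */
typedef struct
{
  uint16_t short_id;
  uint64_t long_id;
  uint64_t undo_lsn;
  uint64_t first_undo_lsn;
} sketch_trn_entry;

/* Read n (<= 8) bytes as a little-endian unsigned integer. */
static uint64_t sketch_read_le(const unsigned char *p, size_t n)
{
  uint64_t v= 0;
  while (n--)
    v= (v << 8) | p[n];
  return v;
}

/* Combine file number and offset into one 64-bit value (assumed layout). */
static uint64_t sketch_read_lsn(const unsigned char *p)
{
  uint64_t file_no= sketch_read_le(p, 3);
  uint64_t offset= sketch_read_le(p + 3, 4);
  return (file_no << 32) | offset;
}

/* Decode one entry and return the pointer just past it. */
static const unsigned char *sketch_parse_trn(const unsigned char *ptr,
                                             sketch_trn_entry *out)
{
  out->short_id= (uint16_t) sketch_read_le(ptr, 2);
  ptr+= 2;
  out->long_id= sketch_read_le(ptr, 6);
  ptr+= 6;
  /* Same sanity check as the DBUG_ASSERT in the loop above. */
  assert(out->short_id > 0 && out->long_id > 0);
  out->undo_lsn= sketch_read_lsn(ptr);
  ptr+= SKETCH_LSN_STORE_SIZE;
  out->first_undo_lsn= sketch_read_lsn(ptr);
  ptr+= SKETCH_LSN_STORE_SIZE;
  return ptr;
}
```

Returning the advanced pointer keeps the caller's loop as simple as the original: call the parser once per entry and continue from the returned position.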
uint nb_committed_transactions= uint4korr(ptr);
ptr+= 4;
tprint(tracef, "%lu committed transactions\n",
(ulong)nb_committed_transactions);
/* no purging => committed transactions are not important */
ptr+= (6 + LSN_STORE_SIZE) * nb_committed_transactions;
/* tables */
uint nb_tables= uint4korr(ptr);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
ptr+= 4;
tprint(tracef, "%u open tables\n", nb_tables);
for (i= 0; i< nb_tables; i++)
{
char name[FN_REFLEN];
uint16 sid= uint2korr(ptr);
ptr+= 2;
DBUG_ASSERT(sid > 0);
File kfile= uint4korr(ptr);
ptr+= 4;
File dfile= uint4korr(ptr);
ptr+= 4;
LSN first_log_write_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
uint name_len= strlen(ptr) + 1;
strmake(name, ptr, sizeof(name)-1);
ptr+= name_len;
if (new_table(sid, name, kfile, dfile, first_log_write_lsn))
return LSN_ERROR;
}
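Each open-table entry parsed above is variable-length: a 2-byte short id, two 4-byte file descriptors, a stored LSN, then a NUL-terminated table name. The sketch below illustrates one detail of that loop: the name is copied into a bounded buffer, but the read pointer advances by the full stored length, so an over-long name cannot desynchronise the parser. It assumes little-endian storage and a 7-byte stored LSN; the `tbl_*` names and the 512-byte name limit are hypothetical stand-ins (the original uses `FN_REFLEN`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumptions: 7-byte stored LSN; 512 stands in for FN_REFLEN. */
enum { TBL_LSN_STORE_SIZE = 7, TBL_NAME_MAX = 512 };

/* Hypothetical in-memory form of one open-table entry. */
typedef struct
{
  uint16_t short_id;
  int32_t kfile, dfile;
  uint64_t first_log_write_lsn;
  char name[TBL_NAME_MAX];
} tbl_entry;

/* Read n (<= 8) bytes as a little-endian unsigned integer. */
static uint64_t tbl_read_le(const unsigned char *p, size_t n)
{
  uint64_t v= 0;
  while (n--)
    v= (v << 8) | p[n];
  return v;
}

/* Decode one entry and return the pointer just past its name. */
static const unsigned char *tbl_parse_entry(const unsigned char *ptr,
                                            tbl_entry *out)
{
  size_t name_len;
  out->short_id= (uint16_t) tbl_read_le(ptr, 2);
  ptr+= 2;
  assert(out->short_id > 0);
  out->kfile= (int32_t) tbl_read_le(ptr, 4);
  ptr+= 4;
  out->dfile= (int32_t) tbl_read_le(ptr, 4);
  ptr+= 4;
  out->first_log_write_lsn=
    (tbl_read_le(ptr, 3) << 32) | tbl_read_le(ptr + 3, 4);
  ptr+= TBL_LSN_STORE_SIZE;
  /* Stored length includes the terminating NUL. */
  name_len= strlen((const char *) ptr) + 1;
  /* Bounded copy: truncate if needed, always NUL-terminate. */
  strncpy(out->name, (const char *) ptr, sizeof(out->name) - 1);
  out->name[sizeof(out->name) - 1]= '\0';
  /* Advance by the stored length, not by the copied length. */
  return ptr + name_len;
}
```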
/* dirty pages */
ulong nb_dirty_pages= uint8korr(ptr);
ptr+= 8;
tprint(tracef, "%lu dirty pages\n", nb_dirty_pages);
if (hash_init(&all_dirty_pages, &my_charset_bin, nb_dirty_pages,
offsetof(struct st_dirty_page, file_and_page_id),
sizeof(((struct st_dirty_page *)NULL)->file_and_page_id),
NULL, NULL, 0))
return LSN_ERROR;
dirty_pages_pool=
(struct st_dirty_page *)my_malloc(nb_dirty_pages *
sizeof(struct st_dirty_page),
MYF(MY_WME));
if (unlikely(dirty_pages_pool == NULL))
return LSN_ERROR;
struct st_dirty_page *next_dirty_page_in_pool= dirty_pages_pool;
LSN minimum_rec_lsn_of_dirty_pages= LSN_MAX;
for (i= 0; i < nb_dirty_pages ; i++)
{
File fileid= uint4korr(ptr);
ptr+= 4;
pgcache_page_no_t pageid= uint4korr(ptr);
ptr+= 4;
LSN rec_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
if (new_page(fileid, pageid, rec_lsn, next_dirty_page_in_pool++))
return LSN_ERROR;
set_if_smaller(minimum_rec_lsn_of_dirty_pages, rec_lsn);
}
/* after that, there will be no insert/delete into the hash */
/*
    sanity check on record (did we screw up with all those "ptr+=", did the
    checkpoint write code and checkpoint read code go out of sync?).
  */
  if (ptr != (log_record_buffer.str + log_record_buffer.length))
  {
    tprint(tracef, "checkpoint record corrupted\n");
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes
storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover().
storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count()->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h: new parameter
storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
    return LSN_ERROR;
  }
  set_if_smaller(start_address, minimum_rec_lsn_of_dirty_pages);
  /*
    Find LSN higher or equal to this TRANSLOG_ADDRESS, suitable for
    translog_read_record() functions
  */
  checkpoint_start= translog_next_LSN(start_address, LSN_IMPOSSIBLE);
  if (checkpoint_start == LSN_IMPOSSIBLE)
  {
    /*
      There must be a problem, as our checkpoint record exists and is >= the
      address which is stored in its first bytes, which is >= start_address.
    */
    return LSN_ERROR;
  }
WL#3071 Maria checkpoint
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed).
WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)
BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am: compile Checkpoint module
storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly.
storage/maria/ma_blockrec.c:
  * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor)
  * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase.
storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c: Checkpoint module.
storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages.
storage/maria/ma_create.c:
  * no need to init some vars as the initial bzero(share) takes care of this.
  * update to new function's name
  * even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen().
storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
  * UNDO_ROW_PURGE gone (see ma_blockrec.c)
  * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed in the checkpoint record (Recovery wants to know the validity domain of an id->name mapping)
  * translog_get_horizon_no_lock() needed for Checkpoint
  * comment about failing assertion (Sanja knows)
  * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header".
  * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook.
  * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock.
  * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN.
storage/maria/ma_loghandler.h: prototype updates
storage/maria/ma_open.c: no need to initialize "res"
storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h: new prototype
storage/maria/ma_recovery.c:
  * added replaying of REDO_RENAME_TABLE
  * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
  * Recovery from the last checkpoint record now possible
  * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id).
  * in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record).
  * parse_checkpoint_record() has to return a LSN, that's what caller expects
storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
  * equivalent of ma_test1's --test-undo added (named -u here).
  * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys" (etc) index state).
storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT.
storage/maria/ma_write.c: comment
storage/maria/maria_chk.c: comment
storage/maria/maria_def.h:
  * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint.
  * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs).
storage/maria/trnman.c:
  * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
  return checkpoint_start;
}
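The end of the function above lowers start_address to the oldest rec_lsn among dirty pages before looking up the first readable LSN. A minimal standalone sketch of that idea follows. It assumes that LSNs compare as plain integers and that the minimum is folded over an array of rec_lsn values; sketch_lsn_t and sketch_redo_start() are hypothetical names for illustration, not part of the Maria code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the engine's LSN type; larger means later in the log. */
typedef uint64_t sketch_lsn_t;

/*
  Return the address from which a REDO phase would start reading:
  the smaller of the checkpoint's own start address and the oldest
  rec_lsn of any dirty page (a page whose changes may not be on disk).
*/
static sketch_lsn_t sketch_redo_start(sketch_lsn_t start_address,
                                      const sketch_lsn_t *dirty_rec_lsns,
                                      size_t count)
{
  size_t i;
  assert(dirty_rec_lsns != NULL || count == 0);
  for (i= 0; i < count; i++)
  {
    /* Same effect as set_if_smaller(start_address, dirty_rec_lsns[i]) */
    if (dirty_rec_lsns[i] < start_address)
      start_address= dirty_rec_lsns[i];
  }
  return start_address;
}
```

With rec_lsn values {500, 120, 300} and a checkpoint start address of 200, the sketch returns 120: the REDO phase has to begin early enough to cover the oldest change that may be missing from disk.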
static int new_page(File fileid, pgcache_page_no_t pageid, LSN rec_lsn,
                    struct st_dirty_page *dirty_page)
{
  /* serves as hash key */
  dirty_page->file_and_page_id= (((uint64)fileid) << 32) | pageid;
  dirty_page->rec_lsn= rec_lsn;
  return my_hash_insert(&all_dirty_pages, (uchar *)dirty_page);
}
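The hash key built in new_page() packs the file descriptor into the upper 32 bits and the page number into the lower 32 bits of one 64-bit value. A minimal sketch of that packing follows. The helper names are hypothetical and not part of ma_recovery.c, and the sketch assumes the page number fits in 32 bits; a larger page number would spill into the file bits and two different (file, page) pairs could produce the same key.

```c
#include <stdint.h>

/*
  Hypothetical helpers, for illustration only. Only the packing expression
  mirrors the one used in new_page().
*/
static uint64_t pack_file_and_page(uint32_t fileid, uint64_t pageid)
{
  /* file id in the high 32 bits, page number in the low 32 bits */
  return (((uint64_t) fileid) << 32) | pageid;
}

/* Recover the file id from a packed key */
static uint32_t unpack_file(uint64_t key)
{
  return (uint32_t) (key >> 32);
}

/* Recover the page number, assuming it was below 2^32 when packed */
static uint64_t unpack_page(uint64_t key)
{
  return key & 0xFFFFFFFFULL;
}
```

Under that assumption, packing file 3 and page 7 gives 0x0000000300000007, and both parts can be read back from the key.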
static int close_all_tables(void)
{
  int error= 0;
  uint count= 0;
  LIST *list_element, *next_open;
  MARIA_HA *info;
  pthread_mutex_lock(&THR_LOCK_maria);
  if (maria_open_list == NULL)
    goto end;
  tprint(tracef, "Closing all tables\n");
  if (tracef != stdout)
  {
    if (recovery_message_printed == REC_MSG_NONE)
      print_preamble();
    for (count= 0, list_element= maria_open_list ;
         list_element ; count++, (list_element= list_element->next))
      ; /* empty body: the loop only counts the open tables */
    fprintf(stderr, "tables to flush:");
    recovery_message_printed= REC_MSG_FLUSH;
  }
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
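The commit notes above describe how the REDO phase recovers state.records idempotently: a record only changes the count if its LSN is greater than state.is_of_lsn, and is_of_lsn is raised when the table is closed. The following is a minimal sketch of that rule, not the code in ma_recovery.c; every `toy_` name and the reduced structs are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;                 /* hypothetical stand-in for an LSN */

/* Hypothetical, reduced view of the state kept in the index header. */
typedef struct {
  uint64_t records;                     /* row count */
  lsn_t    is_of_lsn;                   /* state reflects the log up to here */
} toy_state;

typedef enum { TOY_UNDO_ROW_INSERT, TOY_UNDO_ROW_DELETE } toy_rec_type;

typedef struct {
  lsn_t        lsn;
  toy_rec_type type;
} toy_log_record;

/* Apply one record during the REDO phase. */
void toy_apply(toy_state *st, const toy_log_record *rec)
{
  if (rec->lsn <= st->is_of_lsn)
    return;                             /* already reflected in the state */
  if (rec->type == TOY_UNDO_ROW_INSERT)
    st->records++;
  else
  {
    assert(st->records > 0);
    st->records--;
  }
}

/* Replay a log, then stamp the state as closing the table would. */
void toy_replay(toy_state *st, const toy_log_record *log, int n)
{
  for (int i= 0; i < n; i++)
    toy_apply(st, &log[i]);
  if (n > 0 && log[n - 1].lsn > st->is_of_lsn)
    st->is_of_lsn= log[n - 1].lsn;      /* state is now at least this new */
}
```

Replaying the same log a second time leaves `records` unchanged, because every record is then at or below `is_of_lsn`; this is the property the notes call idempotent.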
/*
Since the end of end_of_redo_phase(), we may have written new records
(if UNDO phase ran) and thus the state is newer than at
end_of_redo_phase(); we need to bump is_of_horizon again.
*/
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
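The checkpoint commit notes above state two LSN comparisons that decide whether an id->name mapping may be used: new_table() skips the table if the mapping is older than create_rename_lsn, and get_MARIA_HA_from_REDO_record() skips a record if the mapping is newer than that record. A minimal sketch of the two checks follows. The `toy_` names are hypothetical, and treating equal LSNs as usable is an assumption of this sketch, not something the notes specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;                 /* hypothetical stand-in for an LSN */

/* Hypothetical, reduced view of one id->name mapping and its table. */
typedef struct {
  lsn_t lsn_of_file_id;                 /* when the mapping was logged */
  lsn_t create_rename_lsn;              /* read from the table's header */
} toy_mapping;

/* new_table()-style check: skip the table if the mapping is older than
   the table's create_rename_lsn (equality is allowed here by assumption). */
bool toy_mapping_usable(const toy_mapping *m)
{
  assert(m != NULL);
  return m->lsn_of_file_id >= m->create_rename_lsn;
}

/* get_MARIA_HA_from_REDO_record()-style check: skip the record if the
   mapping is newer than the record being processed. */
bool toy_record_applies(const toy_mapping *m, lsn_t rec_lsn)
{
  return toy_mapping_usable(m) && m->lsn_of_file_id <= rec_lsn;
}
```

The second check matters for records that precede the checkpoint record: a mapping taken from the checkpoint may describe a table that was assigned its id only after such a record was written.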
TRANSLOG_ADDRESS addr= translog_get_horizon();
for (list_element= maria_open_list ; ; list_element= next_open)
{
if (recovery_message_printed == REC_MSG_FLUSH)
fprintf(stderr, " %u", count--);
if (list_element == NULL)
break;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
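The ma_bitmap.c note above explains why the first bitmap page is now extended to 8190 bytes before its 2-byte end marker is written: a crash between the two steps leaves a page that is visibly too short, so it is recreated instead of being reported as corrupted. The sketch below models that ordering on an in-memory buffer. It is an illustration only: the `toy_` names are hypothetical and the marker bytes are placeholders, not Maria's real marker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_PAGE_SIZE= 8192, TOY_MARKER_SIZE= 2 };

/* Placeholder marker bytes, chosen only for this sketch. */
static const unsigned char toy_marker[TOY_MARKER_SIZE]= { 0xAB, 0xCD };

typedef enum { TOY_PAGE_TOO_SHORT, TOY_PAGE_OK, TOY_PAGE_CORRUPT }
  toy_page_status;

/* Create the first bitmap page in 'file' (at least TOY_PAGE_SIZE bytes).
   Step 1 extends the file to 8190 zero bytes; step 2 appends the marker.
   Returns the resulting file length. */
size_t toy_create_first_bitmap_page(unsigned char *file,
                                    int crash_after_extend)
{
  size_t len= TOY_PAGE_SIZE - TOY_MARKER_SIZE;
  assert(file != NULL);
  memset(file, 0, len);                 /* step 1: extend to 8190 bytes */
  if (crash_after_extend)
    return len;                         /* simulated crash between steps */
  memcpy(file + len, toy_marker, TOY_MARKER_SIZE);   /* step 2: marker */
  return len + TOY_MARKER_SIZE;
}

/* What a later reader concludes from the file length and the marker. */
toy_page_status toy_check_first_bitmap_page(const unsigned char *file,
                                            size_t file_len)
{
  if (file_len < TOY_PAGE_SIZE)
    return TOY_PAGE_TOO_SHORT;          /* page can simply be recreated */
  if (memcmp(file + TOY_PAGE_SIZE - TOY_MARKER_SIZE, toy_marker,
             TOY_MARKER_SIZE) != 0)
    return TOY_PAGE_CORRUPT;            /* full page without its marker */
  return TOY_PAGE_OK;
}
```

With the old ordering (extend to the full 8192 bytes first), the same crash would leave a full-length page without a marker, which the reader cannot distinguish from corruption.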
next_open= list_element->next;
info= (MARIA_HA*)list_element->data;
pthread_mutex_unlock(&THR_LOCK_maria); /* ok, UNDO phase not online yet */
WL#3071 Maria checkpoint Ability for flush_pagecache_blocks() to flush only certain pages of a file, as instructed by an optional "filter" pointer-to-function argument; Checkpoint and background dirty page flushing use that to flush only pages which have been dirty for long enough and bitmap pages. Fix for a bug in flush_cached_blocks() (no idea if it could produce a bug in real life, but theoretically it is). Testing checkpoint in ma_test_recovery via ma_test1 and ma_test2. Background checkpoint & dirty pages flush thread is still disabled by default in ha_maria. mysql-test/r/maria.result: result update storage/maria/ha_maria.cc: blank after function comment storage/maria/ma_checkpoint.c: Using an enum instead of 0/1/2 (applying Sanja's review comments). The comment about "this is a horizon" can be removed as Sanja created translog_next_LSN() which parse_checkpoint_record() uses. Variables in ma_checkpoint_background() cannot be declared in the for() as their value must not be reset at each iteration! storage/maria/ma_pagecache.c: adding to flush_pagecache_blocks() optional arguments 'filter' (pointer to function) and 'filter_arg'; if filter!=NULL this function will be called for each block of the file and will reply if this block and following ones should be flushed or not (3 possible replies). Fixing a bug when flush_cached_blocks() skips a pinned page: it has to unset PCBLOCK_IN_FLUSH set by flush_pagecache_blocks_int(). storage/maria/ma_pagecache.h: flush_pagecache_blocks() is changed to take "filter" and "filter_arg" arguments. "filter", if it is not NULL, may return one value among enum pagecache_flush_filter_result. storage/maria/ma_recovery.c: open_count=0 when closing tables at the end of recovery. storage/maria/ma_test1.c: Optional checkpoints (-H#) at various stages (stages similar to --testflag), for testing of checkpoints. storage/maria/ma_test2.c: Optional checkpoints (-H#) at various stages (stages similar to -t), for testing of checkpoints. 
storage/maria/ma_test_recovery.expected: Result update: the results of the additional test run with -H# (checkpoints) are added here. They are exactly identical to without checkpoints except that the index's Root (printed by maria_chk) is more correct when using checkpoints. This is because checkpoint flushed the state, so it happens to be correct, while no-checkpoint does not flush the state, and recovery does not recover indexes so Root is never fixed. When we recover indices, this will go away. storage/maria/ma_test_recovery: We duplicate the loop of tests to add an additional run with checkpoints at various stages, to see if maria_read_log uses them fine.
2007-10-17 16:55:26 +02:00
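The ma_pagecache.c note above describes an optional filter callback that is called for each block and gives one of three replies about this block and the following ones. Below is a minimal sketch of such a callback-driven flush loop. The enum values, struct fields and function names are hypothetical stand-ins, not the declarations in ma_pagecache.h, and the example filter only approximates "dirty for long enough" by comparing a per-block LSN against a limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three possible replies of a filter (hypothetical names). */
typedef enum {
  TOY_FLUSH_THIS,                       /* flush this block */
  TOY_SKIP_THIS_TRY_NEXT,               /* skip it, keep scanning */
  TOY_SKIP_THIS_AND_REST                /* skip it and stop scanning */
} toy_filter_result;

typedef struct {
  uint64_t rec_lsn;                     /* LSN at which the page got dirty */
  int      is_bitmap;                   /* bitmap pages are always flushed */
  int      flushed;                     /* set by toy_flush_blocks() */
} toy_block;

typedef toy_filter_result (*toy_filter)(const toy_block *block, void *arg);

/* Flush the blocks accepted by 'filter'; a NULL filter accepts all.
   Returns the number of blocks flushed. */
int toy_flush_blocks(toy_block *blocks, int n, toy_filter filter, void *arg)
{
  int flushed= 0;
  assert(blocks != NULL || n == 0);
  for (int i= 0; i < n; i++)
  {
    if (filter != NULL)
    {
      toy_filter_result reply= filter(&blocks[i], arg);
      if (reply == TOY_SKIP_THIS_AND_REST)
        break;
      if (reply == TOY_SKIP_THIS_TRY_NEXT)
        continue;
    }
    blocks[i].flushed= 1;               /* stand-in for the real page write */
    flushed++;
  }
  return flushed;
}

/* Example filter: flush bitmap pages and pages dirty since at or before
   the LSN limit passed through 'arg'. */
toy_filter_result toy_filter_old_or_bitmap(const toy_block *block, void *arg)
{
  uint64_t limit= *(const uint64_t *) arg;
  if (block->is_bitmap || block->rec_lsn <= limit)
    return TOY_FLUSH_THIS;
  return TOY_SKIP_THIS_TRY_NEXT;
}
```

Passing the policy as a function pointer plus an opaque argument keeps the flush loop generic, so Checkpoint and a background flusher could each supply their own rule.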
/*
Tables which we see here are exactly those which were open at the time
of the crash. They might have open_count>0, as Checkpoint may have flushed
their state while they were in use. As Recovery corrected them, don't
alarm the user, don't ask for a table check:
*/
info->s->state.open_count= 0;
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
prepare_table_for_close(info, addr);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
error|= maria_close(info);
pthread_mutex_lock(&THR_LOCK_maria);
}
end:
pthread_mutex_unlock(&THR_LOCK_maria);
return error;
}
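The "Recovery of state.records" commit note earlier in this listing describes replaying UNDO_ROW_INSERT and UNDO_ROW_DELETE records only when their LSN is greater than state.is_of_lsn, and raising is_of_lsn when a table is closed during Recovery. A minimal sketch of why that makes the replay idempotent follows. The types and the functions redo_apply(), close_during_recovery() and replay() are hypothetical simplifications for illustration, not the real Maria structures or API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real Maria structures. */
typedef uint64_t lsn_t;

enum rec_type { UNDO_ROW_INSERT, UNDO_ROW_DELETE };

struct log_rec { lsn_t lsn; enum rec_type type; };

struct table_state
{
  uint64_t records;   /* row count kept in the index file's header */
  lsn_t    is_of_lsn; /* the state already reflects the log up to here */
};

/* REDO phase: apply one record unless the state already includes it. */
static void redo_apply(struct table_state *st, const struct log_rec *rec)
{
  if (rec->lsn <= st->is_of_lsn)
    return;                       /* already counted, skip */
  if (rec->type == UNDO_ROW_INSERT)
    st->records++;
  else
  {
    assert(st->records > 0);      /* a delete needs an existing row */
    st->records--;
  }
}

/* Closing a table during Recovery: its state is as new as cur_lsn. */
static void close_during_recovery(struct table_state *st, lsn_t cur_lsn)
{
  if (cur_lsn > st->is_of_lsn)
    st->is_of_lsn= cur_lsn;
}

/* Replay n records, close the table, and return the record count. */
static uint64_t replay(struct table_state *st, const struct log_rec *log,
                       int n)
{
  int i;
  for (i= 0; i < n; i++)
    redo_apply(st, &log[i]);
  if (n > 0)
    close_during_recovery(st, log[n - 1].lsn);
  return st->records;
}
```

Calling replay() twice on the same state with the same log leaves records unchanged the second time, because every record's LSN is then at or below is_of_lsn.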
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
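The ma_checkpoint.c entry in the commit note above says pointers are reset to NULL after being freed, otherwise the next checkpoint segfaults in my_realloc(). A minimal sketch of that pattern follows, using the standard realloc() and free() as stand-ins; struct ckpt_buf and both functions are hypothetical names for illustration, not the checkpoint module's real code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reusable buffer, standing in for a checkpoint's arrays. */
struct ckpt_buf
{
  char   *data;
  size_t  size;
};

/*
  Grow the buffer, or allocate it on first use: realloc(NULL, n) behaves
  like malloc(n), so a NULL pointer is always a safe starting point.
  Returns 0 on success, 1 on allocation failure.
*/
static int ckpt_buf_reserve(struct ckpt_buf *b, size_t n)
{
  char *p= realloc(b->data, n);
  if (p == NULL)
    return 1;
  b->data= p;
  b->size= n;
  return 0;
}

/*
  Release the buffer and reset the pointer. Without the reset, the next
  ckpt_buf_reserve() would pass a dangling pointer to realloc().
*/
static void ckpt_buf_release(struct ckpt_buf *b)
{
  free(b->data);
  b->data= NULL;
  b->size= 0;
}
```

With the reset in place, a release followed by another reserve is well defined, which matches the "taking multiple checkpoints" case named in the note.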
/**
   @brief Close all table instances with a certain name which are present in
   all_tables.

   @param  name             Name of table
   @param  addr             Log address passed to prepare_table_for_close()
*/
static my_bool close_one_table(const char *name, TRANSLOG_ADDRESS addr)
{
  my_bool res= 0;
  /* There are no other threads using the tables, so we don't need any locks */
  struct st_table_for_recovery *internal_table, *end;
  for (internal_table= all_tables, end= internal_table + SHARE_ID_MAX + 1;
       internal_table < end ;
       internal_table++)
  {
    MARIA_HA *info= internal_table->info;
    if ((info != NULL) && !strcmp(info->s->open_file_name, name))
{
prepare_table_for_close(info, addr);
if (maria_close(info))
res= 1;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
internal_table->info= NULL;
}
}
return res;
}
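The commit note above says that close_one_table() now limits itself to one scan of all_tables[]: it closes every registered handle whose open file name matches and clears the slot. A minimal sketch of that single-loop pattern follows, assuming simplified stand-in types. `TOY_HANDLE`, `TOY_SLOT`, `toy_close()` and `toy_close_one_table()` are hypothetical names for illustration, not the real Maria structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an open table handle. */
typedef struct toy_handle
{
  const char *open_file_name;  /* name the table was opened under */
  int closed;                  /* set to 1 once toy_close() has run */
} TOY_HANDLE;

/* Hypothetical stand-in for one entry of the table registry. */
typedef struct toy_slot
{
  TOY_HANDLE *info;            /* NULL when the slot is unused */
} TOY_SLOT;

/* Pretend close: marks the handle closed, returns 0 on success. */
static int toy_close(TOY_HANDLE *handle)
{
  handle->closed= 1;
  return 0;
}

/*
  Closes every registered handle whose name matches, in a single pass.
  Handles that are not registered in 'slots' are never touched.
  Returns 0 if all closes succeeded, 1 if at least one failed.
*/
static int toy_close_one_table(TOY_SLOT *slots, size_t count,
                               const char *name)
{
  int res= 0;
  size_t i;
  assert(slots != NULL && name != NULL);
  for (i= 0; i < count; i++)
  {
    TOY_HANDLE *info= slots[i].info;
    if ((info != NULL) && !strcmp(info->open_file_name, name))
    {
      if (toy_close(info))
        res= 1;
      slots[i].info= NULL;     /* the slot no longer refers to the handle */
    }
  }
  return res;
}
```

Clearing the slot after closing mirrors the `internal_table->info= NULL` assignment in the listing, so a later scan does not see a stale pointer.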
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
/**
   Temporarily disables logging for this table.

   If that makes the log incomplete, writes a LOGREC_INCOMPLETE_LOG to the log
   to warn log readers.

   @param  info            table
   @param  log_incomplete  if that disabling makes the log incomplete

   @note for example in the REDO phase we disable logging but that does not
   make the log incomplete.
*/
void _ma_tmp_disable_logging_for_table(MARIA_HA *info,
                                       my_bool log_incomplete)
{
  MARIA_SHARE *share= info->s;
  if (log_incomplete)
  {
    uchar log_data[FILEID_STORE_SIZE];
    LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 1];
    LSN lsn;
    log_array[TRANSLOG_INTERNAL_PARTS + 0].str=    (char*) log_data;
    log_array[TRANSLOG_INTERNAL_PARTS + 0].length= sizeof(log_data);
    translog_write_record(&lsn, LOGREC_INCOMPLETE_LOG,
                          info->trn, info, sizeof(log_data),
                          TRANSLOG_INTERNAL_PARTS + 1, log_array,
                          log_data, NULL);
  }
  /* if we disabled before writing the record, record wouldn't reach log */
  share->now_transactional= FALSE;
  share->page_type= PAGECACHE_PLAIN_PAGE;
}
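The comment in the function above notes that the marker record must be written before the table is switched to non-transactional, otherwise the record would not reach the log. The toy model below illustrates that ordering constraint only. It assumes a writer that silently skips records for non-transactional tables; `TOY_TABLE`, `TOY_LOG` and the `toy_*` functions are hypothetical names and do not reproduce the real translog API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared table state. */
typedef struct toy_table
{
  int now_transactional;      /* 1 while changes to the table are logged */
} TOY_TABLE;

/* Hypothetical stand-in for the transaction log. */
typedef struct toy_log
{
  size_t incomplete_markers;  /* count of "incomplete log" records written */
} TOY_LOG;

/*
  Assumed behaviour: the writer skips tables that are not transactional.
  Returns 1 if the record reached the log, 0 if it was skipped.
*/
static int toy_write_incomplete_marker(TOY_LOG *log, const TOY_TABLE *table)
{
  if (!table->now_transactional)
    return 0;
  log->incomplete_markers++;
  return 1;
}

/* Order used in the listing: write the marker first, then disable. */
static void toy_disable_logging(TOY_LOG *log, TOY_TABLE *table,
                                int log_incomplete)
{
  assert(log != NULL && table != NULL);
  if (log_incomplete)
    (void) toy_write_incomplete_marker(log, table);
  table->now_transactional= 0;
}

/* Reversed order, for contrast only: the marker is silently dropped. */
static void toy_disable_logging_wrong_order(TOY_LOG *log, TOY_TABLE *table,
                                            int log_incomplete)
{
  assert(log != NULL && table != NULL);
  table->now_transactional= 0;
  if (log_incomplete)
    (void) toy_write_incomplete_marker(log, table);
}
```

In this model, calling `toy_disable_logging()` on a transactional table leaves one marker in the log, while the reversed variant leaves none. A log reader would then have no way to learn that later REDO records are missing.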
static void print_redo_phase_progress(TRANSLOG_ADDRESS addr)
{
static int end_logno= FILENO_IMPOSSIBLE, end_offset, percentage_printed= 0;
static ulonglong initial_remainder= -1;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
int cur_logno, cur_offset;
ulonglong local_remainder;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
int percentage_done;
  if (tracef == stdout)
    return;
  if (recovery_message_printed == REC_MSG_NONE)
  {
WL#3071 Maria checkpoint, WL#3072 Maria recovery

instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line.
sql_print_error() changed to use my_progname_short (nicer display).
mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r").

include/my_sys.h:
  new flags to tell error_handler_hook that this is not an error but an information or warning
mysql-test/mysql-test-run.pl:
  when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r")
mysys/my_init.c:
  set my_progname_short
mysys/my_static.c:
  my_progname_short added
sql/mysqld.cc:
  * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria.
  * plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs())
  * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen)
storage/maria/ma_checkpoint.c:
  fprintf(stderr) -> ma_message_no_user()
storage/maria/ma_checkpoint.h:
  function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr).
storage/maria/ma_recovery.c:
  To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line.
    071116 18:42:16 [Note] mysqld: Maria engine: starting recovery
    recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds);
    071116 18:42:16 [Note] mysqld: Maria engine: recovery done
storage/maria/maria_chk.c:
  my_progname_short moved to mysys
storage/maria/maria_read_log.c:
  my_progname_short moved to mysys
storage/myisam/myisamchk.c:
  my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
    print_preamble();
    fprintf(stderr, "recovered pages: 0%%");
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    procent_printed= 1;
    recovery_message_printed= REC_MSG_REDO;
  }
  if (end_logno == FILENO_IMPOSSIBLE)
  {
    LSN end_addr= translog_get_horizon();
    end_logno= LSN_FILE_NO(end_addr);
    end_offset= LSN_OFFSET(end_addr);
  }
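The block above splits the log horizon into a file number and an offset within that file. A minimal sketch of such a split follows, assuming the address packs the file number above a 32-bit offset; the type and the `sketch_*` helpers are hypothetical stand-ins for illustration, not the real `LSN`, `LSN_FILE_NO()` and `LSN_OFFSET()` definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a log address; the 32/32-bit split is an assumption. */
typedef uint64_t sketch_lsn;

/* Pack a file number and an offset within that file into one address. */
static sketch_lsn sketch_make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((sketch_lsn) file_no << 32) | offset;
}

/* Upper part: number of the log file the address points into. */
static uint32_t sketch_file_no(sketch_lsn lsn)
{
  return (uint32_t) (lsn >> 32);
}

/* Lower part: byte offset inside that log file. */
static uint32_t sketch_offset(sketch_lsn lsn)
{
  return (uint32_t) (lsn & 0xFFFFFFFFu);
}
```

With this layout, comparing two addresses as plain integers orders them first by file and then by offset, which is why a single horizon value is enough to describe where the log ends.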
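  cur_logno= LSN_FILE_NO(addr);
  cur_offset= LSN_OFFSET(addr);
  /*
    Bytes left to read: either the distance inside the current file, or the
    rest of the current file + all whole files in between + the used part of
    the last file.
  */
  local_remainder= (cur_logno == end_logno) ? (end_offset - cur_offset) :
    (((longlong)log_file_size) - cur_offset +
     max(end_logno - cur_logno - 1, 0) * ((longlong)log_file_size) +
     end_offset);
  if (initial_remainder == (ulonglong)(-1))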
    initial_remainder= local_remainder;
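The remainder arithmetic above can be checked in isolation. The following is a sketch only, assuming all log files have the same size and that the current position never passes the end position; `sketch_log_remainder()` and `sketch_percentage_done()` are hypothetical helpers, and the percentage formula is an assumption about how the remainder might be turned into the "recovered pages" display rather than something shown in the lines above.

```c
#include <assert.h>
#include <stdint.h>

/*
  Bytes of log between (cur_logno, cur_offset) and (end_logno, end_offset),
  assuming every log file is exactly log_file_size bytes long.
*/
static uint64_t sketch_log_remainder(uint32_t cur_logno, uint64_t cur_offset,
                                     uint32_t end_logno, uint64_t end_offset,
                                     uint64_t log_file_size)
{
  uint64_t full_files;
  if (cur_logno == end_logno)
  {
    assert(cur_offset <= end_offset);
    return end_offset - cur_offset;
  }
  assert(cur_logno < end_logno && cur_offset <= log_file_size);
  /* Whole files strictly between the current file and the last one */
  full_files= (uint64_t) (end_logno - cur_logno - 1);
  return (log_file_size - cur_offset) + full_files * log_file_size + end_offset;
}

/* Share of the initial remainder already processed, in percent (assumed formula) */
static unsigned sketch_percentage_done(uint64_t initial_remainder,
                                       uint64_t local_remainder)
{
  assert(local_remainder <= initial_remainder);
  if (initial_remainder == 0)
    return 100;                                 /* nothing to read at all */
  return (unsigned) ((initial_remainder - local_remainder) * 100 /
                     initial_remainder);
}
```

For example, with 1000-byte files, a position at offset 600 of file 1 and an end at offset 200 of file 3 leave 400 + 1000 + 200 = 1600 bytes. Guarding the subtraction with an explicit `cur_logno < end_logno` check avoids relying on `max(..., 0)` over unsigned operands.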
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
percentage_done= ((initial_remainder - local_remainder) * ULL(100) /
initial_remainder);
if ((percentage_done - percentage_printed) >= 10)
{
percentage_printed= percentage_done;
fprintf(stderr, " %d%%", percentage_done);
procent_printed= 1;
}
}
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
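The commit message above describes how the REDO phase keeps state.records idempotent: a record is counted only if its LSN is greater than state.is_of_lsn, and is_of_lsn is raised when the table is closed during Recovery. A minimal sketch of that rule follows. Every name in it (sketch_state, sketch_redo_update_records, sketch_close_table and the enum) is hypothetical and simplified for illustration; it is not the code in ma_recovery.c, and it assumes that UNDO_ROW_DELETE decrements the count.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for illustration only;
   these are not the real Maria structures. */
typedef uint64_t sketch_lsn;

enum sketch_rec_type { SKETCH_UNDO_ROW_INSERT, SKETCH_UNDO_ROW_DELETE };

struct sketch_state
{
  uint64_t   records;    /* row count kept in the index file header */
  sketch_lsn is_of_lsn;  /* the state already reflects the log up to here */
};

/* REDO phase: count the record only if the state is older than it. */
static void sketch_redo_update_records(struct sketch_state *state,
                                       sketch_lsn rec_lsn,
                                       enum sketch_rec_type type)
{
  if (rec_lsn <= state->is_of_lsn)
    return;                               /* already reflected: skip */
  if (type == SKETCH_UNDO_ROW_INSERT)
    state->records++;
  else                                    /* assumed: DELETE decrements */
    state->records--;
}

/* Closing a table during Recovery: the state is at least as new as
   the log record currently being processed. */
static void sketch_close_table(struct sketch_state *state,
                               sketch_lsn current_lsn)
{
  if (current_lsn > state->is_of_lsn)
    state->is_of_lsn= current_lsn;
}
```

With this rule, replaying the same log a second time after the table was closed leaves the count unchanged, because every replayed record then has an LSN that is not greater than is_of_lsn.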
#ifdef MARIA_EXTERNAL_LOCKING
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed in seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
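The commit message above stores the live checksum variation ("delta") in 4 bytes inside the UNDO record, for example 333 when a DELETE moves the checksum from 123 to 456. A minimal sketch of that arithmetic follows. All names are hypothetical, the checksum is assumed to be an unsigned 32-bit value that wraps modulo 2^32, and the little-endian byte order is an assumption of this sketch rather than something stated in the message.

```c
#include <stdint.h>

/* Hypothetical helper names; the real code uses its own macros. */
typedef uint32_t sketch_checksum;

/* Variation caused by one operation, modulo 2^32. */
static sketch_checksum sketch_checksum_delta(sketch_checksum before,
                                             sketch_checksum after)
{
  return (sketch_checksum) (after - before);
}

/* Store the delta in 4 bytes (byte order is an assumption here). */
static void sketch_store_delta(unsigned char *buf, sketch_checksum delta)
{
  buf[0]= (unsigned char) (delta & 0xFF);
  buf[1]= (unsigned char) ((delta >> 8) & 0xFF);
  buf[2]= (unsigned char) ((delta >> 16) & 0xFF);
  buf[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read back a delta written by sketch_store_delta(). */
static sketch_checksum sketch_read_delta(const unsigned char *buf)
{
  return ((sketch_checksum) buf[0]) |
         ((sketch_checksum) buf[1] << 8) |
         ((sketch_checksum) buf[2] << 16) |
         ((sketch_checksum) buf[3] << 24);
}

/* REDO phase: bring an older live checksum forward by the logged delta. */
static sketch_checksum sketch_apply_delta(sketch_checksum checksum,
                                          sketch_checksum delta)
{
  return (sketch_checksum) (checksum + delta);
}
```

Because the arithmetic wraps, a delta computed for a checksum that decreased (say from 456 back to 123) still restores the right value when it is added to the old checksum.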
#error Marias Checkpoint and Recovery are really not ready for it
2007-09-07 15:02:30 +02:00
#endif
/*
Recovery of the state : how it works
=====================================
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
Here we ignore Checkpoints for a start.
The state (MARIA_HA::MARIA_SHARE::MARIA_STATE_INFO) is updated in
memory frequently (at least at every row write/update/delete) but goes
to disk only at a few moments: in maria_close() when closing the last
open instance, and in a few rare places like CHECK/REPAIR/ALTER
(non-transactional tables also do it at maria_lock_database() but we
needn't cover them here).
In case of a crash, the state on disk is likely to be older than what it
was in memory, so the REDO phase needs to recreate the state as it was in
memory at the time of the crash. When we say Recovery here we will always
mean "REDO phase".
Take for example MARIA_STATUS_INFO::records (count of records). It is updated at
the end of every row write/update/delete/delete_all. When Recovery sees the
sign of such a row operation (UNDO or REDO), it may need to update the record
count if that count does not reflect that operation (is older). How to know
the age of the state compared to the log record: every time the state
goes to disk at runtime, its member "is_of_horizon" is updated to the
current end-of-log horizon. So Recovery just needs to compare is_of_horizon
and the record's LSN to know if it should modify "records".
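The comparison described above can be sketched as follows. This is only a
minimal illustration, not the code of this file: the names sketch_lsn,
sketch_state, sketch_row_op and sketch_redo_records() are hypothetical, LSNs
are modelled as plain integers, and it is an assumption of the sketch that a
record located exactly at the horizon counts as "not yet reflected in the
state".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical, simplified stand-ins; these are not the real Maria types.
typedef uint64_t sketch_lsn;

struct sketch_state
{
  uint64_t   records;        // count of records, as read from the index header
  sketch_lsn is_of_horizon;  // end-of-log horizon when the state last went to disk
};

enum sketch_row_op { SKETCH_ROW_INSERT, SKETCH_ROW_DELETE };

// Applies the effect of one logged row operation to the record count.
// Returns 1 if the state was modified, 0 if the log record was skipped
// because the on-disk state already reflects it.
static int sketch_redo_records(struct sketch_state *state,
                               sketch_lsn rec_lsn, enum sketch_row_op op)
{
  assert(state != NULL);
  // Assumption: records strictly before the horizon were logged before the
  // state was written, so the state already accounts for them.
  if (rec_lsn < state->is_of_horizon)
    return 0;
  if (op == SKETCH_ROW_INSERT)
    state->records++;
  else
    state->records--;
  return 1;
}
```

With a state holding records=10 and is_of_horizon=100, a logged insert at
position 50 would be skipped, while one at position 100 or later would bump
the count.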
Other operations like ALTER TABLE DISABLE KEYS update the state but
don't write log records, thus the REDO phase cannot repeat their
effect on the state in case of crash. But we make them sync the state
as soon as they have finished. This reduces the window for a problem.
It looks like only one thread at a time updates the state in memory or
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
on disk. We assume that the upper level (normally MySQL) has protection
against issuing HA_EXTRA_(FORCE_REOPEN|PREPARE_FOR_RENAME) so that these
are not issued while there are any running transactions on the given table.
If this is not done, we may write a corrupted state to disk.
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
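The rule described in this changeset for ma_recovery.c (count an UNDO_ROW_INSERT or UNDO_ROW_DELETE only if its LSN is greater than state.is_of_lsn, then raise is_of_lsn when the table is closed) is what makes the record count idempotent. The following is a minimal sketch of that idea only, not the code in ma_recovery.c: the types `state_sketch` and `log_rec_sketch` and the function `redo_phase_sketch()` are hypothetical, the log is a plain array, and UNDO_ROW_PURGE is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the index header state and for
   log records; the real structures carry much more. */
typedef enum { SK_UNDO_ROW_INSERT, SK_UNDO_ROW_DELETE, SK_OTHER } rec_type_sketch;

typedef struct {
  uint64_t lsn;
  rec_type_sketch type;
} log_rec_sketch;

typedef struct {
  uint64_t records;    /* row count stored in the index file's header */
  uint64_t is_of_lsn;  /* the stored state already reflects the log up to here */
} state_sketch;

/* Replay a log whose LSNs are strictly increasing. Records at or below
   is_of_lsn are already included in state->records and are skipped. */
static void redo_phase_sketch(state_sketch *state,
                              const log_rec_sketch *log, size_t n)
{
  uint64_t last_lsn = state->is_of_lsn;
  size_t i;

  for (i = 0; i < n; i++)
  {
    assert(i == 0 || log[i].lsn > log[i - 1].lsn);
    if (log[i].lsn > state->is_of_lsn)
    {
      if (log[i].type == SK_UNDO_ROW_INSERT)
        state->records++;
      else if (log[i].type == SK_UNDO_ROW_DELETE)
        state->records--;
    }
    last_lsn = log[i].lsn;
  }
  /* "Closing the table": the state is now at least as new as the last
     record looked at, so is_of_lsn moves forward and never backward. */
  if (last_lsn > state->is_of_lsn)
    state->is_of_lsn = last_lsn;
}
```

Running the function a second time over the same log leaves `records` unchanged, because every record now has an LSN that is not greater than `is_of_lsn`.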
With checkpoints
================
Checkpoint module needs to read the state in memory and write it to
disk. This may happen while some other thread is modifying the state
in memory or on disk. Checkpoint may thus be reading data which is being
changed; it needs a mutex so that it does not read a corrupted state, and
concurrent modifiers of the state need that mutex too for the same reason.
"records" is modified for every row write/update/delete, we don't want
to add a mutex lock/unlock there. So we re-use the mutex lock/unlock
which is already present in these moments, namely the log's mutex which is
taken when UNDO_ROW_INSERT|UPDATE|DELETE is written: we update "records" in
under-log-mutex hooks when writing these records (thus "records" is
not updated at the end of maria_write/update/delete() anymore).
Thus Checkpoint takes the log's lock and can read "records" from
memory, write it to disk and release the log's lock.
We however want to avoid having the disk write under the log's
lock. So it has to be under another mutex, natural choice is
intern_lock (as Checkpoint needs it anyway to read MARIA_SHARE::kfile,
and as maria_close() takes it too). All state writes to disk are
changed to be protected with intern_lock.
So Checkpoint takes intern_lock, log's lock, reads "records" from
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this.
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
memory, releases log's lock, updates is_of_horizon and writes "records" to
disk, releases intern_lock.
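The lock order described above can be sketched as follows. This is only an illustration with POSIX mutexes: `share_sketch`, `log_lock`, `log_horizon`, `write_undo_row_insert()` and `checkpoint_save_state()` are hypothetical names, not the real MARIA_SHARE or translog API, and the assumption that the horizon is sampled together with "records" under the log's lock is mine, not something the comment states.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for MARIA_SHARE and the log. */
typedef struct {
  pthread_mutex_t intern_lock;  /* guards every write of the state to disk */
  uint64_t records;             /* in-memory row count, guarded by log_lock */
  uint64_t is_of_horizon;       /* log position the saved state corresponds to */
  uint64_t disk_records;        /* stands in for the index file's header */
} share_sketch;

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t log_horizon;    /* grows with every appended record */

static void share_sketch_init(share_sketch *share)
{
  pthread_mutex_init(&share->intern_lock, NULL);
  share->records = share->is_of_horizon = share->disk_records = 0;
}

/* Row insert: the counter changes inside the critical section which the
   log write needs anyway, so no extra mutex is taken per row. */
static void write_undo_row_insert(share_sketch *share)
{
  pthread_mutex_lock(&log_lock);
  log_horizon++;                /* stands in for appending UNDO_ROW_INSERT */
  share->records++;             /* the under-log-mutex hook */
  pthread_mutex_unlock(&log_lock);
}

/* Checkpoint: intern_lock first, then the log's lock only for the copy. */
static void checkpoint_save_state(share_sketch *share)
{
  uint64_t records_copy, horizon_copy;

  pthread_mutex_lock(&share->intern_lock);
  pthread_mutex_lock(&log_lock);
  records_copy = share->records;
  horizon_copy = log_horizon;   /* assumption: sampled together with records */
  pthread_mutex_unlock(&log_lock);

  /* The slow part runs without the log's lock, so row writers are not
     stalled by the disk write. */
  assert(horizon_copy >= share->is_of_horizon);
  share->is_of_horizon = horizon_copy;
  share->disk_records = records_copy;  /* stands in for the disk write */
  pthread_mutex_unlock(&share->intern_lock);
}
```

The point of the design is that the log's lock is held only long enough to copy a consistent pair of values, while intern_lock serializes the state writes themselves.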
In practice, not only "records" needs to be written but the full
state. So, Checkpoint reads the full state from memory. Some other
thread may at this moment be modifying in memory some pieces of the
state which are not protected by the log's lock (see ma_extra.c
HA_EXTRA_NO_KEYS), and Checkpoint would be reading a corrupted state
from memory; to guard against that we extend the intern_lock-zone to
changes done to the state in memory by HA_EXTRA_NO_KEYS et al, and
also any change made in memory to create_rename_lsn/state_is_of_horizon.
2007-09-07 15:02:30 +02:00
Last, in Checkpoint we don't want to do
log lock; read state from memory; release log lock;
separately for each table, as that may hold the log's lock for too long in total.
So, we instead do
log lock; read N states from memory; release log lock;
Thus, the sequence above happens outside of any intern_lock.
But this re-introduces the problem that some other thread may be changing the
state in memory and on disk under intern_lock, without log's lock, like
HA_EXTRA_NO_KEYS, while we read the N states. However, when Checkpoint later
comes to handling the table under intern_lock, which is serialized with
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
HA_EXTRA_NO_KEYS, it can see that is_of_horizon is higher than when the state
was read from memory under log's lock, and thus can decide to not flush the
obsolete state it has, knowing that the other thread flushed a more recent
state already. If on the other hand is_of_horizon is not higher, the read
state is current and can be flushed. So we have a per-table sequence:
lock intern_lock; test if is_of_horizon is higher than when we read the state
under log's lock; if no then flush the read state to disk.
*/
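/*
  A minimal sketch of the two-step sequence described in the comment above,
  given here only as an illustration. All names below (SKETCH_SHARE,
  SKETCH_STATE_COPY, sketch_log_lock, sketch_read_states(),
  sketch_flush_state_if_current()) are hypothetical and are not the real
  Checkpoint code; the "disk" is simulated by a struct member, and the log's
  horizon is passed in as a plain integer. The comparison follows the wording
  of the comment: the copied state is skipped only if is_of_horizon is
  strictly higher than the horizon at which the copy was taken.
*/

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a table share. */
typedef struct sketch_share
{
  pthread_mutex_t intern_lock;
  uint64_t is_of_horizon;    /* horizon of the state last flushed to disk */
  uint64_t records;          /* in-memory state, changed under log's lock */
  uint64_t records_on_disk;  /* simulates the state stored in the header */
} SKETCH_SHARE;

/* Copy of a state as read by Checkpoint under the log's lock. */
typedef struct sketch_state_copy
{
  uint64_t records;
  uint64_t horizon_at_read;  /* log horizon when the copy was taken */
} SKETCH_STATE_COPY;

/* Stand-in for the log's lock. */
static pthread_mutex_t sketch_log_lock= PTHREAD_MUTEX_INITIALIZER;

/*
  Step 1: "log lock; read N states from memory; release log lock".
  One lock acquisition covers all N tables; no intern_lock is taken here.
*/
static void sketch_read_states(SKETCH_SHARE **shares,
                               SKETCH_STATE_COPY *copies, size_t n,
                               uint64_t log_horizon)
{
  size_t i;
  assert(shares != NULL && copies != NULL);
  pthread_mutex_lock(&sketch_log_lock);
  for (i= 0; i < n; i++)
  {
    copies[i].records= shares[i]->records;
    copies[i].horizon_at_read= log_horizon;
  }
  pthread_mutex_unlock(&sketch_log_lock);
}

/*
  Step 2, per table: "lock intern_lock; test if is_of_horizon is higher than
  when we read the state under log's lock; if no then flush the read state".
  Returns 1 if the copy was flushed, 0 if it was obsolete and skipped.
*/
static int sketch_flush_state_if_current(SKETCH_SHARE *share,
                                         const SKETCH_STATE_COPY *copy)
{
  int flushed= 0;
  pthread_mutex_lock(&share->intern_lock);
  if (share->is_of_horizon <= copy->horizon_at_read)
  {
    /* Nobody flushed a more recent state: our copy is current. */
    share->records_on_disk= copy->records;
    share->is_of_horizon= copy->horizon_at_read;
    flushed= 1;
  }
  pthread_mutex_unlock(&share->intern_lock);
  return flushed;
}
```

/*
  In this sketch a thread such as the one handling HA_EXTRA_NO_KEYS would
  write its own state and raise is_of_horizon under intern_lock; a later
  call to sketch_flush_state_if_current() with an older copy then returns 0
  and leaves the newer on-disk state untouched.
*/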
/* some comments and pseudo-code which we keep for later */
#if 0
/*
MikaelR suggests: support checkpoints during REDO phase too: take a
checkpoint after a certain number of log records has been executed. This
helps against repeated crashes. Those checkpoints could not be
user-requested (as the engine is not communicating during the REDO phase),
so they would be automatic: this changes the original assumption that we
don't write to the log while in the REDO phase, but why not. How often
should we checkpoint?
*/
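/*
  A minimal sketch of the automatic-checkpoint idea above, not actual Maria
  code. It only counts records and checkpoints to show the control flow.
  REDO_RECORDS_PER_CHECKPOINT, take_automatic_checkpoint() and
  run_redo_phase() are hypothetical names, and the threshold is arbitrary,
  since how often to checkpoint is still an open question.
*/
```c
/* Hypothetical threshold, not a real Maria setting. */
#define REDO_RECORDS_PER_CHECKPOINT 1000UL

/* Number of automatic checkpoints taken by the last run_redo_phase(). */
static unsigned long checkpoints_taken= 0;

/* Hypothetical stand-in for a real checkpoint: only counts the calls. */
static void take_automatic_checkpoint(void)
{
  checkpoints_taken++;
}

/*
  Pretends to execute total_records REDO records and takes a checkpoint
  after every REDO_RECORDS_PER_CHECKPOINT of them.
  Returns the number of checkpoints taken.
*/
static unsigned long run_redo_phase(unsigned long total_records)
{
  unsigned long executed, since_checkpoint= 0;
  checkpoints_taken= 0;
  for (executed= 0; executed < total_records; executed++)
  {
    /* a real implementation would apply one log record here */
    if (++since_checkpoint == REDO_RECORDS_PER_CHECKPOINT)
    {
      take_automatic_checkpoint();
      since_checkpoint= 0;
    }
  }
  return checkpoints_taken;
}
```
/*
  With this counting scheme a crash during the REDO phase loses at most
  REDO_RECORDS_PER_CHECKPOINT records of progress; a time-based or
  byte-based trigger would be an equally plausible choice.
*/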
/*
We want to have two steps:
engine->recover_with_max_memory();
next_engine->recover_with_max_memory();
engine->init_with_normal_memory();
next_engine->init_with_normal_memory();
So: in recover_with_max_memory() allocate a giant page cache, do REDO
phase, then all page cache is flushed and emptied and freed (only retain
small structures like TM): take full checkpoint, which is useful if
next engine crashes in its recovery the next second.
Destroy all shares (maria_close()), then at init_with_normal_memory() we
do this:
*/
/**** UNDO PHASE *****/
/*
Launch one or more threads to do the background rollback. Don't wait for
them to complete their rollback (background rollback; for debugging, we
can have an option which waits). Set a counter (total_of_rollback_threads)
to the number of threads to launch.
Note that InnoDB's rollback-in-background works as long as InnoDB is the
last engine to recover, otherwise MySQL will refuse new connections until
the last engine has recovered so it's not "background" from the user's
point of view. InnoDB is near top of sys_table_types so all others
(e.g. BDB) recover after it... So it's really "online rollback" only if
InnoDB is the only engine.
*/
/* wake up delete/update handler */
/* tell the TM that it can now accept new transactions */
/*
mark that checkpoint requests are now allowed.
*/
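/*
  A minimal sketch of the two-step start-up described earlier in this block
  (all engines run recover_with_max_memory() before any engine runs
  init_with_normal_memory()). It is an illustration under assumptions, not
  the server's real handlerton interface: ENGINE_SKETCH,
  recover_all_engines() and the two stub engines "a" and "b" are
  hypothetical, and the stubs only record the order in which they are
  called.
*/
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-engine interface with the two steps from the comment. */
typedef struct st_engine_sketch
{
  const char *name;
  int (*recover_with_max_memory)(void);  /* REDO phase, giant page cache */
  int (*init_with_normal_memory)(void);  /* normal start-up, UNDO phase */
} ENGINE_SKETCH;

/* Records the order of calls so that it can be inspected afterwards. */
static char call_trace[64];

static void trace(const char *step)
{
  strncat(call_trace, step, sizeof(call_trace) - strlen(call_trace) - 1);
}

/* Stub engines: they do no work, they only leave a mark in the trace. */
static int a_recover(void) { trace("Ra "); return 0; }
static int a_init(void)    { trace("Ia "); return 0; }
static int b_recover(void) { trace("Rb "); return 0; }
static int b_init(void)    { trace("Ib "); return 0; }

/*
  Step 1 for every engine, then step 2 for every engine.
  Returns 0 on success, 1 as soon as any step fails.
*/
static int recover_all_engines(const ENGINE_SKETCH *engines, size_t count)
{
  size_t i;
  for (i= 0; i < count; i++)
    if (engines[i].recover_with_max_memory())
      return 1;
  for (i= 0; i < count; i++)
    if (engines[i].init_with_normal_memory())
      return 1;
  return 0;
}

/* Runs the two stub engines and leaves the call order in call_trace. */
static int demo_two_engines(void)
{
  const ENGINE_SKETCH engines[]=
  {
    { "a", a_recover, a_init },
    { "b", b_recover, b_init }
  };
  call_trace[0]= '\0';
  return recover_all_engines(engines, sizeof(engines) / sizeof(engines[0]));
}
```
/*
  Keeping the two loops separate is the point of the design: only one
  engine at a time holds its giant page cache, and each engine has freed it
  before the next one starts its own REDO phase.
*/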
#endif